Setting --average_embeddings averages the output embeddings over all tokens. If this option is not used (the default), only the output embedding at the chosen token position (specified by --trainable_token_pos) is considered, for example, the embedding of the last token. Enabling --average_embeddings instead mean-pools the embeddings of all tokens into the position chosen by --trainable_token_pos (the last token by default). As we can see, this improves the performance from 95.00% to 96.33% with only a minimal increase in run time (0.28 min to 0.32 min) and might be worth considering in practice.
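The following minimal sketch (not the exact training code, and using placeholder tensor shapes) illustrates the difference between the two settings: selecting the output embedding at a single token position versus mean-pooling over all token positions before passing the result to the classification head.

```python
import torch

# Model output embeddings: (batch_size, num_tokens, emb_dim); values here are random placeholders
batch_size, num_tokens, emb_dim = 8, 120, 768
output_embeddings = torch.randn(batch_size, num_tokens, emb_dim)

# Default: keep only the embedding at the chosen token position,
# e.g. the last token (analogous to --trainable_token_pos = -1)
last_token_emb = output_embeddings[:, -1, :]      # shape: (batch_size, emb_dim)

# --average_embeddings: mean-pool the embeddings over all tokens instead
mean_pooled_emb = output_embeddings.mean(dim=1)   # shape: (batch_size, emb_dim)

print(last_token_emb.shape, mean_pooled_emb.shape)
# torch.Size([8, 768]) torch.Size([8, 768])
```

Either way, the classifier receives a single vector per example; mean-pooling simply lets that vector summarize all token positions rather than just one.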