Add CLS pooling, which is frequently used for BERT-like models. It extracts the first token's hidden state as the embedding.
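A minimal sketch of CLS pooling, assuming the model's last hidden states come as a `(batch, seq_len, hidden)` tensor (function name `cls_pool` is illustrative, not from any existing API):

```python
import torch


def cls_pool(last_hidden_states: torch.Tensor) -> torch.Tensor:
    # last_hidden_states: (batch, seq_len, hidden)
    # The [CLS] token is at position 0, so select it for every example.
    return last_hidden_states[:, 0]
```

This assumes the tokenizer prepends the special token (as BERT tokenizers do), so no attention mask is needed.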
Add last token pooling, which is frequently used for recent text embedding models with a decoder architecture, such as https://huggingface.co/intfloat/e5-mistral-7b-instruct. It uses the last non-padding token's hidden state as the embedding.
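A hedged sketch of last token pooling, assuming a `(batch, seq_len, hidden)` tensor and an attention mask; the padding-side check mirrors the approach shown on the e5-mistral-7b-instruct model card (function name `last_token_pool` is illustrative):

```python
import torch


def last_token_pool(
    last_hidden_states: torch.Tensor, attention_mask: torch.Tensor
) -> torch.Tensor:
    # last_hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
    # With left padding, the last position is real for every example.
    left_padding = attention_mask[:, -1].sum() == attention_mask.shape[0]
    if left_padding:
        return last_hidden_states[:, -1]
    # With right padding, find each sequence's last non-padding position.
    seq_lengths = attention_mask.sum(dim=1) - 1
    batch_idx = torch.arange(last_hidden_states.shape[0])
    return last_hidden_states[batch_idx, seq_lengths]
```

Decoder-only models attend causally, so only the final token has seen the whole input, which is why its hidden state serves as the sequence embedding.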