kongds / scaling_sentemb

Scaling Sentence Embeddings with Large Language Models

Is it necessary to add </s>? #8

Closed 545999961 closed 7 months ago

545999961 commented 7 months ago

I noticed that the provided usage examples do not include '</s>', so the last token, '"', is taken as the embedding. However, in the training files, '</s>' was included, and '</s>' was chosen as the embedding. Which one should be considered the primary choice?

kongds commented 7 months ago

Hello, we also don't add '</s>' during training; add_eos_token is set to False here: https://github.com/kongds/scaling_sentemb/blob/8567aa083c1b3c77586670f91e7f78eb80694ad3/ft_llm.py#L390-L392
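
To illustrate what add_eos_token changes, here is a minimal toy sketch, not the code from ft_llm.py. The token list, the helper names build_tokens and embedding_position, and the example prompt are all hypothetical; a real tokenizer produces ids rather than strings. It only shows which position ends up last, and is therefore used as the embedding, with and without '</s>'.

```python
from typing import List

# Assumption: "</s>" is the end-of-sequence token, as in the issue title.
EOS = "</s>"


def build_tokens(prompt_tokens: List[str], add_eos_token: bool) -> List[str]:
    """Toy stand-in for a tokenizer: optionally append EOS to the prompt tokens."""
    tokens = list(prompt_tokens)
    if add_eos_token:
        tokens.append(EOS)
    return tokens


def embedding_position(tokens: List[str]) -> int:
    """Index of the last token, whose hidden state would serve as the embedding."""
    return len(tokens) - 1


# Hypothetical prompt that ends with an opening quote, split by hand for illustration.
prompt = ["This", "sentence", ":", '"', "hello", '"',
          "means", "in", "one", "word", ":", '"']

without_eos = build_tokens(prompt, add_eos_token=False)
with_eos = build_tokens(prompt, add_eos_token=True)

# With add_eos_token=False the last position is the final '"' of the prompt;
# with add_eos_token=True it would be '</s>' instead.
print(repr(without_eos[embedding_position(without_eos)]))
print(repr(with_eos[embedding_position(with_eos)]))
```

Under these assumptions, keeping add_eos_token set to False in both training and inference means the embedding is read from the same '"' position in both cases.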

545999961 commented 7 months ago

Thanks for your help!