McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
https://mcgill-nlp.github.io/llm2vec/
MIT License

"The MNTP LoRA weights are merged into the base model, and the trainable LoRA weights are initialized with SimCSE weights." #137

Open cultivater opened 1 month ago

cultivater commented 1 month ago

"The MNTP LoRA weights are merged into the base model, and the trainable LoRA weights are initialized with SimCSE weights."


Hi, I saw this in your article, but I couldn't find the corresponding configuration in your code. Your supervised contrastive learning config (`train_configs/supervised/MetaLlama3.json`) only contains `"model_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct"` and `"peft_model_name_or_path": "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp"`.

May I know where the SimCSE weights checkpoint is loaded?

vaibhavad commented 2 weeks ago

Hi @cultivater,

Thanks for your interest in our work. For simplicity, the released train configs correspond to the best-performing models, which do not include SimCSE initialization for supervised contrastive learning.

To get "MNTP+SimCSE" as an initialization point, you will need to merge MNTP weights into the base model separately and provide that model checkpoint address in "model_name_or_path". The SimCSE weights will then be specified in "peft_model_name_or_path".

Hope this clarifies your issue. Let me know if you have any further questions.