FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Pre-training the embedding models #1169

Open · panteaHK opened this issue 3 weeks ago

panteaHK commented 3 weeks ago

I want to continue pre-training the bge-en-icl model before fine-tuning it. Could you point me to an example of how to do that? I believe the examples are no longer in your repo.

545999961 commented 3 weeks ago

You can refer to the bge-en-icl finetune example.

panteaHK commented 3 weeks ago

I looked at the examples, but it's still unclear to me how to set up the fine-tuning to do MLM or contrastive learning.

545999961 commented 3 weeks ago

The bge-en-icl model does not require pre-training; it can be directly fine-tuned.