FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License

Creating distillation data #832

Open sevenandseven opened 4 months ago

sevenandseven commented 4 months ago

Hello, I saw that distillation can be used when fine-tuning the bge-m3 model. How do I create a distillation dataset?

staoxiao commented 4 months ago

Use a reranker model (e.g., bge-reranker-v2-m3) to compute a score for each query-passage pair.

sevenandseven commented 4 months ago

> Use a reranker model (e.g., bge-reranker-v2-m3) to compute a score for each pair.

Could you please provide detailed steps on how to do this?
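
A minimal sketch of what these steps might look like, not an official recipe. It assumes the training data is a JSON Lines file where each record has `query`, `pos`, and `neg` fields, and that the teacher scores belong in `pos_scores` and `neg_scores`. It also assumes the teacher is loaded through `FlagReranker.compute_score` from the `FlagEmbedding` package. The helper names (`add_teacher_scores`, `make_reranker_scorer`, `build_distillation_file`) are hypothetical and only for illustration.

```python
import json
from typing import Callable, List, Sequence

# A scorer maps a list of [query, passage] pairs to one float per pair.
ScoreFn = Callable[[List[List[str]]], Sequence[float]]


def add_teacher_scores(record: dict, score_fn: ScoreFn) -> dict:
    """Return a copy of `record` with pos_scores / neg_scores from the teacher."""
    query = record["query"]
    passages = list(record["pos"]) + list(record["neg"])
    pairs = [[query, passage] for passage in passages]
    # Score positives and negatives in one call, then split them back apart.
    scores = [float(s) for s in score_fn(pairs)] if pairs else []
    n_pos = len(record["pos"])
    return {**record, "pos_scores": scores[:n_pos], "neg_scores": scores[n_pos:]}


def make_reranker_scorer(model_name: str = "BAAI/bge-reranker-v2-m3") -> ScoreFn:
    """Build a scorer backed by FlagReranker (assumed teacher; downloads the model)."""
    from FlagEmbedding import FlagReranker  # imported lazily: heavy dependency

    reranker = FlagReranker(model_name, use_fp16=True)

    def score(pairs: List[List[str]]) -> Sequence[float]:
        result = reranker.compute_score(pairs)
        # Some versions return a bare float when given a single pair.
        return [result] if isinstance(result, (int, float)) else list(result)

    return score


def build_distillation_file(in_path: str, out_path: str, score_fn: ScoreFn) -> None:
    """Read JSONL records, attach teacher scores, and write them back as JSONL."""
    with open(in_path, encoding="utf-8") as fin, \
         open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            if not line.strip():
                continue  # skip blank lines
            scored = add_teacher_scores(json.loads(line), score_fn)
            fout.write(json.dumps(scored, ensure_ascii=False) + "\n")
```

With the real teacher this would be called as `build_distillation_file("train.jsonl", "train_with_scores.jsonl", make_reranker_scorer())`, where both file names are placeholders. Because the scorer is passed in as a function, a different reranker can be swapped in without changing the rest of the code.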