shibing624 / text2vec

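text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.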
https://pypi.org/project/text2vec/
Apache License 2.0
4.39k stars, 392 forks

Question about BGE fine-tuning #125

Open CuteMing opened 1 year ago

CuteMing commented 1 year ago

Describe the Question

Hello, roughly how many samples (triplets) are in the Chinese STS-B dataset you used to train and evaluate the fine-tuned BGE model?

shibing624 commented 1 year ago

The data has been released: https://github.com/shibing624/text2vec/blob/master/examples/data/bge_finetune_data.jsonl
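
With the file available, the sample count asked about above can be checked directly. The following is a minimal sketch, not the author's tooling: it assumes the file has been downloaded locally (the path `bge_finetune_data.jsonl` is a placeholder) and that it follows the usual JSONL convention of one JSON value per line. The thread does not describe the record fields, so the helper `count_jsonl_records` (a hypothetical name) only counts records and does not inspect them.

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Count the non-empty lines of a JSONL file, verifying each parses as JSON."""
    count = 0
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"line {line_no} is not valid JSON") from exc
            count += 1
    return count


if __name__ == "__main__":
    # Assumed local path; download the released file first.
    local_path = Path("bge_finetune_data.jsonl")
    if local_path.exists():
        print(count_jsonl_records(local_path))
```

Parsing every line is slower than counting newlines, but it catches a truncated download before the number is trusted.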

How the samples were constructed:

[image]

stale[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. (The bot automatically closes this issue after prolonged inactivity; feel free to ask again if needed.)