Open LLLiHaotian opened 6 months ago
@LLLiHaotian , you need to fine-tune the model on your downstream data, and select the best pretrain ckpt based on the downstream performance.
I only want to use the encoder part to produce representations, and I have no downstream task for the time being. So I would like to know how to determine which ckpt works best. Looking forward to your answers.
There is no appropriate metric for evaluating the pre-training task on its own. We recommend selecting the ckpt based on its performance after fine-tuning on a downstream task.
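For anyone who wants to automate that selection, here is a minimal sketch of the idea. It is not code from this repo: `select_best_checkpoint` and the `evaluate` callback are hypothetical names, and `evaluate` is assumed to fine-tune (or otherwise score) one ckpt on your downstream data and return a single number where higher is better.

```python
from typing import Callable, Iterable, Tuple


def select_best_checkpoint(
    checkpoints: Iterable[str],
    evaluate: Callable[[str], float],
) -> Tuple[str, float]:
    """Return (checkpoint, score) for the highest downstream score.

    `evaluate` is a user-supplied, hypothetical callback: it should
    fine-tune/evaluate one checkpoint on the downstream task and return
    a metric where higher is better (e.g. a retrieval recall value).
    """
    # Score every checkpoint once with the user-supplied callback.
    scored = [(evaluate(ckpt), ckpt) for ckpt in checkpoints]
    if not scored:
        raise ValueError("no checkpoints were given")
    # Compare on the score only; on ties the earliest checkpoint wins.
    best_score, best_ckpt = max(scored, key=lambda pair: pair[0])
    return best_ckpt, best_score
```

If your metric is lower-is-better, return its negative from `evaluate` so the comparison still holds.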
After pretraining on my task-specific training dataset, what type of data should I use for fine-tuning on the downstream retrieval task? I'm unsure whether to use only my own downstream data (similar sentence pairs, not many of them) for fine-tuning, or to combine a large amount of public STS/retrieval data with my own downstream data.
Is it normal for the loss to occasionally rise like this, and how should I judge when it is appropriate to stop pre-training?
bge-m3-patent-retromae_batch56_max350.log
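One rough way to look at a noisy loss curve is to smooth it and compare the last two windows, so that occasional spikes matter less than the overall trend. The sketch below is only a heuristic under assumptions: the function names, the window size, and the 1% threshold are made up for illustration, the input is assumed to be a plain list of per-step loss values you have already extracted from your log, and a flat pre-training loss does not guarantee the best downstream performance.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early entries average over fewer points."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the trailing window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


def has_plateaued(
    losses: Sequence[float],
    window: int = 100,                  # assumed window size, tune for your run
    min_rel_improvement: float = 0.01,  # assumed threshold (1%)
) -> bool:
    """True if the mean loss of the last window improved by less than
    `min_rel_improvement` relative to the window before it."""
    if len(losses) < 2 * window:
        return False  # not enough steps to compare two full windows
    previous = sum(losses[-2 * window:-window]) / window
    latest = sum(losses[-window:]) / window
    return (previous - latest) < min_rel_improvement * abs(previous)
```

Plotting `moving_average(losses, window)` next to the raw values makes it easier to see whether a rise is a short spike or a sustained trend.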