Closed — yclzju closed this issue 1 year ago
Thanks for your interest.
The fine-tuning code is based on https://github.com/microsoft/unilm/tree/master/simlm, with only minor differences in the input format.
For the pre-training part, we currently have no plan to release the collected data, but implementation of the contrastive loss is fairly straightforward.
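To illustrate the point about the contrastive loss being straightforward, here is a minimal sketch of an in-batch-negative InfoNCE loss in PyTorch. This is not the authors' released code; the function name, temperature value, and batch layout (the i-th passage is the positive for the i-th query) are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(q, p, temperature=0.05):
    """In-batch-negative contrastive (InfoNCE) loss sketch.

    q: (B, d) query embeddings; p: (B, d) passage embeddings.
    The i-th passage is the positive for the i-th query; all
    other passages in the batch act as negatives.
    """
    q = F.normalize(q, dim=-1)
    p = F.normalize(p, dim=-1)
    # (B, B) cosine-similarity matrix, scaled by the temperature
    logits = q @ p.T / temperature
    # the correct "class" for query i is passage i (the diagonal)
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

# toy check: when each query equals its positive passage,
# the diagonal dominates and the loss is close to zero
q = torch.randn(8, 32)
loss = info_nce_loss(q, q.clone())
```

Hard negatives or a cross-batch negative pool can be added by concatenating extra passage embeddings along the second dimension of `logits`.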
Liang
Hi, thanks for the reply. If I have a dataset in another language, such as a Chinese dataset, can I use [intfloat/multilingual-e5-base] as the initial checkpoint and then fine-tune it with the SimLM code?
Sure, you can certainly do that.
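For reference, E5 models produce sentence embeddings by average-pooling the token states with the attention mask (and expect "query: " / "passage: " input prefixes), so a fine-tuning setup on top of the checkpoint needs a pooling step like the sketch below. The function name is illustrative, not from the released code:

```python
import torch

def average_pool(last_hidden, attention_mask):
    """Masked mean pooling over the sequence dimension.

    last_hidden: (B, T, d) token states from the encoder;
    attention_mask: (B, T) with 1 for real tokens, 0 for padding.
    """
    mask = attention_mask.unsqueeze(-1).float()
    # zero out padding positions, then average over real tokens only
    return (last_hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# toy check: with all-ones states, pooling returns all-ones vectors
# regardless of how much padding the mask excludes
hidden = torch.ones(2, 4, 3)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
pooled = average_pool(hidden, mask)
```

In a fine-tuning loop you would load the checkpoint with `AutoModel.from_pretrained("intfloat/multilingual-e5-base")`, apply this pooling to its `last_hidden_state`, and feed the pooled vectors into the contrastive loss.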
Hi, would you release the pre-training and fine-tuning code for E5?