drxmy closed this 2 years ago
Thanks for your interest! In principle you shouldn't need to change much. You need to swap the English BERT models (both the initial bi-encoder and the base model) for Chinese ones. For example, for the bi-encoder you could use simcse-chinese-roberta-wwm-ext (I just picked this one from a quick Google search); the corresponding base model is hfl/chinese-roberta-wwm-ext.
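A minimal sketch of that swap, assuming the checkpoints are loaded through the Hugging Face `transformers` library. The names `CHINESE_BASE_MODEL`, `CHINESE_BI_ENCODER`, and `load_encoder` are hypothetical and not part of this repository. Only the short name of the bi-encoder is given above, so its full hub ID (or a local path) has to be filled in by whoever runs this.

```python
# Hypothetical helper for swapping in Chinese checkpoints; not this repo's API.

# Base model named in the reply above.
CHINESE_BASE_MODEL = "hfl/chinese-roberta-wwm-ext"

# Only the short name appears in the thread. Replace it with the full
# hub ID or a local path before loading (assumption: you locate it yourself).
CHINESE_BI_ENCODER = "simcse-chinese-roberta-wwm-ext"


def load_encoder(name_or_path):
    """Load a tokenizer and encoder for the given checkpoint.

    The import is deferred so this module can be imported without
    `transformers` installed. The Auto classes read the checkpoint's
    config; hfl/chinese-roberta-wwm-ext is a BERT-architecture
    checkpoint despite the "roberta" in its name.
    """
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(name_or_path)
    model = AutoModel.from_pretrained(name_or_path)
    return tokenizer, model
```

Calling `load_encoder(CHINESE_BASE_MODEL)` downloads the checkpoint on first use. The bi-encoder and the base model should share the same vocabulary, which is why the two checkpoints above are suggested as a pair.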
Thank you for replying so quickly. Yes, the model definitely needs to change. I was mainly concerned about the data processing. I will try it tomorrow.
@drxmy Have you tried it yet? How well does it work?
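I used my own dataset, and the results so far are a bit strange; I may not have modified the code correctly somewhere. The bi-encoder's validation metric is too high, and later it turns into NaN. I haven't had time recently to look into what the problem is, though.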
Or will it work just fine with Chinese? Really interesting work!