iflytek / cino

CINO: Pre-trained Language Models for Chinese Minority (少数民族语言预训练模型)
http://cino.hfl-rc.com
Apache License 2.0
212 stars 28 forks

How can I continue training from the released model, e.g. on monolingual data? #20

Closed. anbo724 closed this issue 2 years ago

anbo724 commented 2 years ago

Hello, I would like to ask how to continue training the model on a specific language, for example on my own Chinese, Tibetan, or Mongolian data.

airaria commented 2 years ago

Tokenize your text with the CINO tokenizer, then train on the MLM (masked language modeling) task as usual.
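
A minimal sketch of what this reply describes, assuming the Hugging Face `transformers` library and PyTorch are installed. The checkpoint name `hfl/cino-base-v2`, the one-sentence-per-line corpus file, the helper names `load_lines` and `continue_pretraining`, and all hyperparameters are illustrative assumptions, not settings confirmed by the CINO authors.

```python
from pathlib import Path

# Assumption: a CINO checkpoint name on the Hugging Face Hub; swap in the one you use.
MODEL_NAME = "hfl/cino-base-v2"


def load_lines(path):
    """Read a UTF-8 text file and return its non-empty lines, stripped."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def continue_pretraining(train_file, output_dir, model_name=MODEL_NAME,
                         max_length=256, epochs=1):
    """Continue MLM training of a CINO checkpoint on a monolingual corpus.

    Hypothetical helper: hyperparameters below are placeholders to tune.
    """
    # Heavy imports are kept local so the module can be imported without them.
    from transformers import (
        AutoModelForMaskedLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    # Load the CINO tokenizer and the model with its masked-LM head.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)

    # Tokenize each line; padding is left to the collator (dynamic per batch).
    texts = load_lines(train_file)
    enc = tokenizer(texts, truncation=True, max_length=max_length)
    features = [
        {"input_ids": ids, "attention_mask": mask}
        for ids, mask in zip(enc["input_ids"], enc["attention_mask"])
    ]

    # The collator pads each batch and randomly masks 15% of tokens for MLM.
    collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm=True, mlm_probability=0.15
    )

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=epochs,
        per_device_train_batch_size=8,
        learning_rate=5e-5,
        save_strategy="epoch",
    )

    # A plain list of feature dicts serves as a map-style dataset here.
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=features,
        data_collator=collator,
    )
    trainer.train()

    # Save model and tokenizer together so the result can be reloaded later.
    trainer.save_model(output_dir)
    tokenizer.save_pretrained(output_dir)
```

With a hypothetical corpus file `tibetan.txt`, the call would look like `continue_pretraining("tibetan.txt", "cino-tibetan")`. A small learning rate is a common choice for continued pre-training, since large updates can erase what the model already knows about its other languages.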

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.