lancopku / pkuseg-python

The pkuseg toolkit for multi-domain Chinese word segmentation
MIT License
6.55k stars 986 forks

Which datasets were used to train the default model? #152

Open bayesrule opened 3 years ago

bayesrule commented 3 years ago

According to the paper, pretraining used three datasets: PKU (news), Weibo (web), and CTB8 (hybrid). Was the default model further fine-tuned after this pretraining, or is it the pretrained model used directly? Thanks in advance for the answer!

jesuswa commented 11 months ago

Replying in 2023.