ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
https://ieeexplore.ieee.org/document/9599397
Apache License 2.0

Question about the tokenizer used for fine-tuning and prediction #125

Closed · Jun-Zhang-32108 closed this 4 years ago

Jun-Zhang-32108 commented 4 years ago

As I understand it, you used the LTP word segmenter during pre-training. Do I also need to use LTP when fine-tuning with your models? When I load any of your models with the transformers library, either it falls back to the same character-based tokenizer as BERT-base, or it reports that the model's vocab.json cannot be found. Is this a bug?

ymcui commented 4 years ago
  1. For downstream tasks, use WordPiece; see the sketch below.
  2. See https://github.com/ymcui/Chinese-BERT-wwm/issues/122
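A minimal sketch of this setup, assuming the HuggingFace Hub checkpoint ID `hfl/chinese-bert-wwm-ext` (the issue does not name a specific checkpoint; any of the repo's released checkpoints should behave the same way). The model is loaded with `BertTokenizer`, which reads the character-level WordPiece vocabulary (`vocab.txt`) that ships with every BERT checkpoint:

```python
# Minimal sketch: tokenization for fine-tuning/prediction with Chinese-BERT-wwm.
# Assumption: the checkpoint ID "hfl/chinese-bert-wwm-ext" (not named in the issue).
from transformers import BertTokenizer, BertModel

# LTP word segmentation is only used at pre-training time, to decide which
# whole words to mask. Downstream code uses the ordinary character-level
# WordPiece tokenizer, exactly as with BERT-base.
tokenizer = BertTokenizer.from_pretrained("hfl/chinese-bert-wwm-ext")
model = BertModel.from_pretrained("hfl/chinese-bert-wwm-ext")

inputs = tokenizer("使用语言模型来预测下一个词的概率。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, seq_len, 768])
```

Note that no vocab.json is involved: BERT-style WordPiece tokenizers store their vocabulary in vocab.txt, so a tokenizer class that expects vocab.json (e.g. a byte-level BPE tokenizer) will fail to load these checkpoints, which would explain the error described above.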