ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT(中文BERT-wwm系列模型)
https://ieeexplore.ieee.org/document/9599397
Apache License 2.0

Question: why does a word correspond to multiple [MASK] tokens? #197

Closed fangwc closed 3 years ago

fangwc commented 3 years ago

During masking, a whole word is now masked at the same time, but a two-character word, for example, still corresponds to two [MASK] tokens. Why doesn't a word correspond to just one [MASK], even when it has multiple characters? Thanks in advance.

ymcui commented 3 years ago

If two characters corresponded to a single [MASK], what would the model predict at that position? The prediction space, i.e. the vocabulary, is built from characters (at the WordPiece level), so each masked character needs its own position and its own prediction target.
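A minimal sketch of this idea, assuming word boundaries come from some external segmenter (the function and example below are illustrative, not the repo's actual data pipeline): whole word masking selects a whole word, but still emits one [MASK] per character, so every masked position has a single character-level label in the vocabulary.

```python
MASK = "[MASK]"

def whole_word_mask(words, words_to_mask):
    """words: list of segmented words, e.g. ["使用", "语言", "模型"].
    words_to_mask: set of indices of words chosen for masking."""
    input_tokens, labels = [], []
    for i, word in enumerate(words):
        for char in word:                  # each Chinese character is one token
            if i in words_to_mask:
                input_tokens.append(MASK)  # one [MASK] per character...
                labels.append(char)        # ...so each position has exactly one
                                           # character target in the vocabulary
            else:
                input_tokens.append(char)
                labels.append(None)        # position not predicted
    return input_tokens, labels

tokens, labels = whole_word_mask(["使用", "语言", "模型"], words_to_mask={1})
print(tokens)  # ['使', '用', '[MASK]', '[MASK]', '模', '型']
print(labels)  # [None, None, '语', '言', None, None]
```

Collapsing a multi-character word into a single [MASK] would instead require predicting the whole word as one unit, which the character-level vocabulary cannot express.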

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 3 years ago

Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.