guotong1988 closed this issue 3 years ago
@ymcui THX!
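What you describe, getting the same effect from character-level masking by adding pre-training time and randomness, is hard to achieve in practice. Compute is already a major bottleneck, so it is not a good idea to rely on pure randomness to get effectiveness. WWM always selects whole words to mask. If you rely on randomness alone, the probability that every masked position in a sentence happens to line up with whole words is very low.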
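To make the probability argument concrete, here is a minimal sketch, not the repository's actual masking code. It assumes each character (or each word) is masked independently with probability `mask_prob`, which simplifies BERT's real masking budget and its 80/10/10 replacement rule. The function names and the hand-segmented example words are hypothetical.

```python
import random

MASK = "[MASK]"


def whole_word_mask(words, mask_prob=0.15, rng=None):
    """Mask at word granularity: a selected word has ALL of its characters masked.

    `words` is a list of already-segmented words (assumption: segmentation
    is done elsewhere by a word segmenter).
    """
    rng = rng or random.Random()
    out = []
    for word in words:
        if rng.random() < mask_prob:
            out.extend([MASK] * len(word))  # mask every character of the word
        else:
            out.extend(word)  # keep every character of the word
    return out


def char_level_mask(words, mask_prob=0.15, rng=None):
    """Mask each character independently, ignoring word boundaries."""
    rng = rng or random.Random()
    return [MASK if rng.random() < mask_prob else ch
            for word in words for ch in word]


def aligned_probability(word_lengths, mask_prob=0.15):
    """Probability that independent character-level masking leaves every word
    either fully masked or fully unmasked (no partially masked word)."""
    prob = 1.0
    for n in word_lengths:
        # A word of n characters is "aligned" if all n are masked or none are.
        prob *= mask_prob ** n + (1 - mask_prob) ** n
    return prob


if __name__ == "__main__":
    words = ["使用", "语言", "模型", "来", "预测"]  # hypothetical segmentation
    print(whole_word_mask(words, rng=random.Random(0)))
    print(char_level_mask(words, rng=random.Random(0)))
    print(aligned_probability([len(w) for w in words]))
```

Under these assumptions, each two-character word contributes a factor of 0.15² + 0.85² = 0.745, so a sentence of 20 two-character words ends up with a word-aligned mask pattern only about 0.3% of the time under character-level masking. Most of that remaining mass is the case where the words are simply left unmasked, which is why extra training time alone is unlikely to reproduce what WWM does by construction.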
@ymcui Thanks a lot!