ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm series models)
https://ieeexplore.ieee.org/document/9599397
Apache License 2.0

Does RoBERTa-wwm-ext randomly initialize the weights of the MLM head? #212

Closed dolphin-Jia closed 2 years ago

dolphin-Jia commented 2 years ago

I saw in Su Jianlin's blog that "HIT's open-source RoBERTa-wwm-ext-large, for reasons unknown, randomly initialized the weights of the MLM head", so I would like to ask whether RoBERTa-wwm-ext does the same.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ymcui commented 2 years ago

The released checkpoints only contain the Transformer (encoder) weights; if you need to run MLM, please do continued pre-training of the task head yourself. As for the MLM head appearing randomly initialized: this was caused by the pre-training script when the checkpoint was re-saved (the Adam-related parameters were removed to reduce file size).
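A minimal sketch (not from this thread) of what that means in practice, assuming the `hfl/chinese-roberta-wwm-ext` checkpoint on Hugging Face and the `transformers` library: loading the model with an MLM head will emit a "newly initialized weights" warning for the head, which then needs continued MLM pre-training before its predictions are meaningful.

```python
# Sketch: load RoBERTa-wwm-ext with an MLM head via Hugging Face Transformers.
# Assumes the "hfl/chinese-roberta-wwm-ext" checkpoint is available.
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")

# Since the release ships only the encoder weights, the MLM head
# (cls.predictions.*) is randomly initialized here; expect a warning about
# newly initialized weights. Its parameters must be trained (e.g. via
# continued MLM pre-training on your own corpus) before use.
model = BertForMaskedLM.from_pretrained("hfl/chinese-roberta-wwm-ext")
```

For downstream tasks such as classification or sequence labeling this does not matter, because only the encoder weights are reused and the task head is trained from scratch anyway.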