ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT(中文BERT-wwm系列模型)
https://ieeexplore.ieee.org/document/9599397
Apache License 2.0

ValueError: Couldn't find 'checkpoint' file or checkpoints in given directory chinese_roberta_wwm_ext_L-12_H-768_A-12 #51

Closed · fyubang closed this issue 4 years ago

fyubang commented 4 years ago

ValueError: Couldn't find 'checkpoint' file or checkpoints in given directory chinese_roberta_wwm_ext_L-12_H-768_A-12

I tried to convert the RoBERTa checkpoint with https://github.com/huggingface/pytorch-transformers/blob/master/pytorch_transformers/convert_tf_checkpoint_to_pytorch.py and got the error above. What could be the cause?
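(For readers hitting the same error: the linked script essentially builds a `BertForPreTraining` model from the config file and copies the TF weights into it. Below is a minimal sketch of that logic using the current transformers API; the thread-era pytorch_transformers version differs slightly, and all paths are illustrative. Note that the checkpoint path must be the file prefix, not the directory.)

```python
import torch
from transformers import BertConfig, BertForPreTraining, load_tf_weights_in_bert

# Build the model skeleton from the released config file.
config = BertConfig.from_json_file(
    "chinese_roberta_wwm_ext_L-12_H-768_A-12/bert_config.json")
model = BertForPreTraining(config)

# Copy TF variables into the PyTorch model. The path is the checkpoint
# *prefix* (bert_model.ckpt.index/.data-* sit next to it), not the directory.
load_tf_weights_in_bert(
    model, config, "chinese_roberta_wwm_ext_L-12_H-768_A-12/bert_model.ckpt")

torch.save(model.state_dict(), "pytorch_model.bin")
```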

ymcui commented 4 years ago

It's probably a mistake in your command line. Could you post the exact command?

ymcui commented 4 years ago

Or just download the converted model directly: https://drive.google.com/open?id=1eHM3l4fMo6DsQYGmey7UZGiTmQquHw25

fyubang commented 4 years ago

> It's probably a mistake in your command line. Could you post the exact command?

python convert_tf_to_torch.py --tf_checkpoint_path chinese_roberta_wwm_ext_L-12_H-768_A-12/ --bert_config_file tensorflow/chinese_roberta_wwm_ext_L-12_H-768_A-12/bert_config.json --pytorch_dump_path torch/chinese/
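(Note for later readers: this command points `--tf_checkpoint_path` at the directory, but the released TF archives ship bert_model.ckpt.index / bert_model.ckpt.data-* files and no top-level `checkpoint` state file, which is exactly what the ValueError complains about. Passing the bert_model.ckpt prefix instead resolves it. A quick way to verify, assuming the standard archive layout:)

```python
import tensorflow as tf

# Fails with the ValueError above: the directory contains no 'checkpoint'
# state file for TensorFlow to resolve.
# reader = tf.train.load_checkpoint("chinese_roberta_wwm_ext_L-12_H-768_A-12")

# Works: point at the checkpoint prefix, not the directory.
reader = tf.train.load_checkpoint(
    "chinese_roberta_wwm_ext_L-12_H-768_A-12/bert_model.ckpt")
print(sorted(reader.get_variable_to_shape_map())[:5])
```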

fyubang commented 4 years ago

> Or just download the converted model directly: https://drive.google.com/open?id=1eHM3l4fMo6DsQYGmey7UZGiTmQquHw25

Thanks! So I can run it directly with the original BERT code?

ymcui commented 4 years ago

Yes, use it exactly the same way you would use the original BERT.

fyubang commented 4 years ago

> Yes, use it exactly the same way you would use the original BERT.

Thanks, it's running now, but the results are much worse: roughly 70 (bert-wwm-ext) vs 45 (roberta). Are there any empirical hyperparameter adjustments for RoBERTa? The original RoBERTa paper dropped token_type; should token_type be dropped when using your model as well?

ymcui commented 4 years ago

In principle it shouldn't need much tuning, but since I haven't tried it in PyTorch I can't guarantee the converted model works correctly. You could also try brightmart's RoBERTa model and see whether it gives better results.

ymcui commented 4 years ago

The RoBERTa models in this repository are used exactly like BERT; token_type should be included.
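(To make the usage point concrete: a minimal sketch, assuming a recent transformers version and that the converted weights, a config.json, and vocab.txt sit in ./torch/chinese/ as in the command above; all paths are illustrative. BertTokenizer produces token_type_ids by default, so sentence pairs are encoded exactly as with BERT.)

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("./torch/chinese/")
model = BertModel.from_pretrained("./torch/chinese/")

# A sentence pair: the tokenizer emits token_type_ids (0s for the first
# segment, 1s for the second), which this model keeps, unlike original RoBERTa.
inputs = tokenizer("今天天气真好", "适合出去散步", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768)
```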