Unable to load the chinese-roberta-wwm-ext model

hanmy1021 commented 4 years ago

Whether I download the model from huggingface.co/models and load it from disk, or load it directly by the model name hfl/chinese-roberta-wwm-ext, and whether I use RobertaTokenizer or BertTokenizer, I always get the following error:

Traceback (most recent call last):
  File "BERTbaseline_pytorch.py", line 727, in <module>
    main()
  File "BERTbaseline_pytorch.py", line 631, in main
    cache_dir=args.cache_dir if args.cache_dir else None,
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/transformers/tokenization_auto.py", line 197, in from_pretrained
    return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/transformers/tokenization_utils.py", line 393, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/transformers/tokenization_utils.py", line 544, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/transformers/tokenization_roberta.py", line 149, in __init__
    **kwargs,
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/transformers/tokenization_gpt2.py", line 157, in __init__
    with open(vocab_file, encoding="utf-8") as vocab_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType

Loading other models such as bert, albert, and xlnet works fine, though. How can I fix this? Thanks.

ymcui commented 4 years ago

Hi, I just tested this and it works without any problems. Environment: transformers==2.8.0 and torch==1.4.0

from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
model = BertModel.from_pretrained("hfl/chinese-roberta-wwm-ext")
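
For a locally downloaded copy, a quick check of which vocabulary files are present may help explain the `NoneType` error above: BertTokenizer reads `vocab.txt`, while RobertaTokenizer (built on the GPT-2 tokenizer seen in the traceback) reads `vocab.json` and `merges.txt`. The following is only a minimal sketch; the helper name `missing_vocab_files` and the directory layout are assumptions for illustration, not part of transformers.

```python
from pathlib import Path

# Vocabulary files each tokenizer family looks for in a local model directory.
EXPECTED_VOCAB_FILES = {
    "bert": ["vocab.txt"],
    "roberta": ["vocab.json", "merges.txt"],
}


def missing_vocab_files(model_dir, tokenizer_family):
    """Return the expected vocabulary files that are absent from model_dir.

    Hypothetical helper: it only inspects file names on disk and does not
    call into transformers.
    """
    directory = Path(model_dir)
    return [
        name
        for name in EXPECTED_VOCAB_FILES[tokenizer_family]
        if not (directory / name).is_file()
    ]
```

If `missing_vocab_files(path, "roberta")` is non-empty while `missing_vocab_files(path, "bert")` is empty, the directory holds a BERT-style vocabulary and should presumably be loaded with the Bert classes shown above.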

LIANGQINGYUAN commented 4 years ago

Can the model only be loaded in a PyTorch environment? Is loading not supported under TF 2.x? Thanks.

ymcui commented 4 years ago

Sorry, I'm not very familiar with how well transformers supports TF 2.x. I'd suggest asking in the transformers project for a solution.
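
One thing that may be worth trying under TF 2.x, offered as an untested sketch rather than a confirmed answer: transformers provides `TFBertModel`, and its `from_pretrained` accepts `from_pt=True` to convert PyTorch weights on load. This assumes both tensorflow and torch are installed; the wrapper name `load_tf_model` is hypothetical.

```python
def load_tf_model(model_name="hfl/chinese-roberta-wwm-ext"):
    """Load the tokenizer and a TensorFlow model for a BERT-style checkpoint.

    Hypothetical wrapper. Imports are done lazily so the function can be
    defined even where transformers or TensorFlow is not installed.
    """
    from transformers import BertTokenizer, TFBertModel

    # The Bert classes are used here, matching the PyTorch example in this thread.
    tokenizer = BertTokenizer.from_pretrained(model_name)
    # from_pt=True asks transformers to convert the PyTorch checkpoint;
    # this requires torch to be installed alongside tensorflow.
    model = TFBertModel.from_pretrained(model_name, from_pt=True)
    return tokenizer, model
```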

yylun commented 4 years ago

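This is caused by transformers switching to loading the tokenizer automatically. The cached config file downloaded for hfl/chinese-roberta-wwm-ext has model_type set to roberta; manually changing it to bert makes loading work, since the architecture is still BERT's.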

I've been stuck on this several times too... If possible, could @ymcui please update the config file directly?
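
The manual fix described above could be scripted roughly as follows. This is a minimal sketch that assumes you know the path of the local or cached config file (the cache may store it under a hashed file name); `set_model_type` is a hypothetical helper, not a transformers API.

```python
import json
from pathlib import Path


def set_model_type(config_path, new_type="bert"):
    """Rewrite the model_type field of a JSON config file in place.

    Returns the previous model_type (or None if the field was absent).
    All other fields are left unchanged.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    old_type = config.get("model_type")
    if old_type != new_type:
        config["model_type"] = new_type
        path.write_text(
            json.dumps(config, indent=2, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )
    return old_type
```

For example, `set_model_type("/path/to/config.json")` would switch a config from roberta to bert, where the path is a placeholder for wherever the file actually lives on your machine.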

ymcui commented 4 years ago

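Hi, I took a look, and it seems huggingface has automatically updated the config. For the roberta-wwm-ext and roberta-wwm-ext-large models, model_type has now been updated to bert. Thanks for letting me know.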

ymcui / Chinese-BERT-wwm

Unable to load the chinese-roberta-wwm-ext model #104