OSError: Can't load tokenizer for '../bert/torch_roberta_wwm'

zhishui3 commented 3 years ago

File "D:\Event_Extraction\event_extraction2020\src_final\preprocess\processor.py", line 653, in convert_examples_to_features
  tokenizer = BertTokenizer.from_pretrained(bert_dir)
File "D:\soft\Anaconda3\lib\site-packages\transformers-4.2.2-py3.8.egg\transformers\tokenization_utils_base.py", line 1760, in from_pretrained
OSError: Can't load tokenizer for '../bert/torch_roberta_wwm' Make sure that:

'../bert/torch_roberta_wwm' is a correct model identifier listed on 'https://huggingface.co/models'
or '../bert/torch_roberta_wwm' is the correct path to a directory containing relevant tokenizer files

Haven't I already downloaded the BERT model? Setting bert_dir = './bert/' directly raises the same error, even though the docstring of from_pretrained(cls, pretrained_model_name_or_path: Union[str, os.PathLike], *init_inputs, **kwargs) describes the argument like this:

pretrained_model_name_or_path - A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the :meth:`~transformers.tokenization_utils_base.PreTrainedTokenizerBase.save_pretrained` method, e.g., ./my_model_directory/.

When I set bert_dir = 'roberta-large' instead (a model identifier listed on 'https://huggingface.co/models'), I get a different error:
File "D:\Event_Extraction\event_extraction2020\src_final\preprocess\processor.py", line 653, in convert_examples_to_features
  tokenizer = BertTokenizer.from_pretrained(bert_dir)
File "D:\soft\Anaconda3\lib\site-packages\transformers-4.2.2-py3.8.egg\transformers\tokenization_utils_base.py", line 1769, in from_pretrained
File "D:\soft\Anaconda3\lib\site-packages\transformers-4.2.2-py3.8.egg\transformers\tokenization_utils_base.py", line 1841, in _from_pretrained
File "D:\soft\Anaconda3\lib\site-packages\transformers-4.2.2-py3.8.egg\transformers\models\bert\tokenization_bert.py", line 193, in __init__
File "D:\soft\Anaconda3\lib\genericpath.py", line 30, in isfile
  st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

Could anyone tell me what is causing this?

WuHuRestaurant commented 3 years ago

There is no vocab.txt in the folder.
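A quick way to check this before calling from_pretrained, as a minimal sketch: `missing_tokenizer_files` is a hypothetical helper (it is not part of transformers), and it assumes vocab.txt is the only file BertTokenizer needs from the directory. Printing the resolved path also shows which folder a relative path actually points at.

```python
from pathlib import Path


def missing_tokenizer_files(bert_dir, required=("vocab.txt",)):
    """Return the names in `required` that are not regular files inside bert_dir.

    Hypothetical helper for illustration only; `required` defaults to the
    vocabulary file that BertTokenizer looks for.
    """
    root = Path(bert_dir)
    return [name for name in required if not (root / name).is_file()]


# The path from the error message; a relative path is resolved against the
# current working directory of the Python process.
bert_dir = "../bert/torch_roberta_wwm"
print("resolved to:", Path(bert_dir).resolve())
print("missing files:", missing_tokenizer_files(bert_dir))
```

If the printed list is empty, the directory that Python sees does contain vocab.txt and the cause is probably elsewhere.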

zhishui3 commented 3 years ago

vocab.txt is there. It worked once I changed the relative path to an absolute path. Thanks!
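For anyone hitting the same error, here is a rough sketch of one way to get an absolute path without hard-coding it. It rests on an assumption the thread does not confirm: that the relative path failed because it was resolved against the current working directory rather than the script's folder. `resolve_model_dir` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_model_dir(relative_dir, anchor_file):
    """Resolve relative_dir against the folder containing anchor_file.

    Hypothetical helper: the result is absolute, so it does not depend on
    the directory the script was launched from.
    """
    return (Path(anchor_file).resolve().parent / relative_dir).resolve()


# Hypothetical usage inside processor.py (requires transformers installed):
#   from transformers import BertTokenizer
#   bert_dir = resolve_model_dir("../bert/torch_roberta_wwm", __file__)
#   tokenizer = BertTokenizer.from_pretrained(str(bert_dir))
```

The relative part passed in must be written relative to the script's own folder, which may differ from the value that was used when launching from another directory.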

WuHuRestaurant / xf_event_extraction2020Top1
