How would one have to adjust `run_prediction.py` to work with a RoBERTa model?
I tried changing the tokenizer to `RobertaTokenizer` and setting the data loader to `model_type=roberta`, but that doesn't seem to change anything: the tokenizer is still looking for `vocab.txt`, which doesn't exist (a RoBERTa checkpoint ships `vocab.json` and `merges.txt` instead).
```
Didn't find file experiments/ABT_BUY__ROBERTA__20200718_161003/vocab.txt. We won't load it.
loading file None
loading file experiments/ABT_BUY__ROBERTA__20200718_161003/added_tokens.json
loading file experiments/ABT_BUY__ROBERTA__20200718_161003/special_tokens_map.json
loading file experiments/ABT_BUY__ROBERTA__20200718_161003/tokenizer_config.json
Traceback (most recent call last):
  File "/home/jonas/Bachelor-Arbeit/entity-matching-transformer/src/run_prediction.py", line 21, in <module>
    model, tokenizer = load_model(os.path.join(args.model_output_dir, args.trained_model_for_prediction), args.do_lower_case)
  File "/home/jonas/Bachelor-Arbeit/entity-matching-transformer/src/model.py", line 26, in load_model
    tokenizer = BertTokenizer.from_pretrained(model_dir, do_lower_case=do_lower_case)
  File "/home/jonas/anaconda3/envs/bert/lib/python3.8/site-packages/pytorch_transformers/tokenization_utils.py", line 293, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/home/jonas/anaconda3/envs/bert/lib/python3.8/site-packages/pytorch_transformers/tokenization_utils.py", line 421, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/jonas/anaconda3/envs/bert/lib/python3.8/site-packages/pytorch_transformers/tokenization_bert.py", line 149, in __init__
    if not os.path.isfile(vocab_file):
  File "/home/jonas/anaconda3/envs/bert/lib/python3.8/genericpath.py", line 30, in isfile
    st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
```
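From the traceback, the tokenizer class is hardcoded in `load_model` in `src/model.py` (`tokenizer = BertTokenizer.from_pretrained(...)`), so setting `model_type=roberta` on the data loader never reaches the prediction path. Below is a minimal sketch of how `load_model` could dispatch on the model type instead. The `model_type` parameter and the `MODEL_CLASSES` mapping are hypothetical additions (`run_prediction.py` would also need to pass `model_type` through at its `load_model` call), the `*ForSequenceClassification` classes stand in for whatever model class the repository actually uses, and it assumes a `pytorch_transformers` release that ships the RoBERTa classes:

```python
# src/model.py -- a sketch, not the repository's actual code.
# Assumes pytorch_transformers >= 1.1.0, the first release with RoBERTa support.
from pytorch_transformers import (
    BertForSequenceClassification,
    BertTokenizer,
    RobertaForSequenceClassification,
    RobertaTokenizer,
)

# Map a model_type string to its (model, tokenizer) classes.
MODEL_CLASSES = {
    "bert": (BertForSequenceClassification, BertTokenizer),
    "roberta": (RobertaForSequenceClassification, RobertaTokenizer),
}

def load_model(model_dir, do_lower_case, model_type="bert"):
    model_class, tokenizer_class = MODEL_CLASSES[model_type]
    # do_lower_case is a BERT-specific option; RoBERTa's byte-level BPE
    # tokenizer does not use it.
    tokenizer_kwargs = {"do_lower_case": do_lower_case} if model_type == "bert" else {}
    # RobertaTokenizer looks for vocab.json and merges.txt in model_dir,
    # not BERT's vocab.txt -- hence the "Didn't find file ... vocab.txt" warning.
    tokenizer = tokenizer_class.from_pretrained(model_dir, **tokenizer_kwargs)
    model = model_class.from_pretrained(model_dir)
    return model, tokenizer
```

Note that this only helps if the checkpoint directory actually contains the RoBERTa tokenizer files (`vocab.json`, `merges.txt`). If training also went through `BertTokenizer`, they were never written there; re-saving them once, e.g. with `RobertaTokenizer.from_pretrained("roberta-base").save_pretrained(model_dir)`, would fill that gap (assuming the model was fine-tuned from `roberta-base`).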