Closed: rfernand2 closed this issue 4 years ago
--model_name_or_path should be a folder, so you should use just ./output instead.
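To illustrate the point about passing a folder rather than a file, here is a minimal sketch. The helper name resolve_model_dir is hypothetical and not part of transformers; it assumes the folder is recognizable by a config.json inside it, which matches the checkpoint contents listed later in this thread.

```python
from pathlib import Path


def resolve_model_dir(model_name_or_path: str) -> Path:
    """Return the folder to pass as --model_name_or_path.

    Hypothetical helper, not part of transformers: if the caller points at a
    file inside the output folder, fall back to the folder that contains it.
    """
    path = Path(model_name_or_path)
    # A file such as ./output/pytorch_model.bin is replaced by ./output.
    model_dir = path.parent if path.is_file() else path
    # Assumption: a usable model folder contains a config.json.
    if not (model_dir / "config.json").is_file():
        raise FileNotFoundError(f"no config.json found in {model_dir}")
    return model_dir
```

With this, both ./output and a file inside ./output resolve to the same folder, while a folder without config.json fails early with a clear message.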
Thanks. Verified - that fixed it. Please add a note in the README.md to explain this. Thanks.
Hi, may I ask how you got these checkpoint files? I tried to specify the path to the checkpoint generated by the script during training (containing config.json, optimizer.pt, pytorch_model.bin, scheduler.pt, training_args.bin), but I got a traceback like this:
Traceback (most recent call last):
File "run_language_modeling.py", line 277, in <module>
main()
File "run_language_modeling.py", line 186, in main
tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path, cache_dir=model_args.cache_dir)
File "H:\Anaconda3\envs\env_name\lib\site-packages\transformers\tokenization_auto.py", line 203, in from_pretrained
return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "H:\Anaconda3\envs\env_name\lib\site-packages\transformers\tokenization_utils.py", line 902, in from_pretrained
return cls._from_pretrained(*inputs, **kwargs)
File "H:\Anaconda3\envs\env_name\lib\site-packages\transformers\tokenization_utils.py", line 1007, in _from_pretrained
list(cls.vocab_files_names.values()),
OSError: Model name 'C:\\path-to-ckpt\\checkpoint-17500' was not found in tokenizers model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, bert-base-finnish-cased-v1, bert-base-finnish-uncased-v1, bert-base-dutch-cased). We assumed 'C:\\path-to-ckpt\\checkpoint-17500' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.txt'] but couldn't find such vocabulary files at this path or url.
which essentially says that the checkpoint folder is missing some files, in this case the tokenizer's vocab.txt. I wonder where this mismatch comes from, given that I used the same script to train.
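A quick way to see which files a checkpoint folder lacks is sketched below. The helper missing_files is hypothetical, and the list of required names is an assumption assembled from this thread (the model files listed above plus the vocab.txt named in the error), not an official list from the library.

```python
from pathlib import Path
from typing import Iterable, List


def missing_files(checkpoint_dir: str, required: Iterable[str]) -> List[str]:
    """Return the names in `required` that are not files in checkpoint_dir."""
    folder = Path(checkpoint_dir)
    return [name for name in required if not (folder / name).is_file()]


# Assumed file names, taken from the messages in this thread.
REQUIRED = ["config.json", "pytorch_model.bin", "vocab.txt"]
```

Calling missing_files on the checkpoint folder with REQUIRED would report vocab.txt if only the tokenizer files are absent, which is what the OSError above suggests.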
For those new to this issue: I just figured it out, so this may save you some time 😜😀
What is this error about? When you run the model for the first time, it downloads some files (such as pytorch_model.bin). If your internet connection drops partway through, the pipeline keeps running without having fully downloaded pytorch_model.bin, which raises this error.
Steps:
1. Go to C:\Users\UserName\.cache
2. Delete the .cache folder
3. Done. Just run the model once again.
You can reach me on Twitter: @prashantmore999
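The cache-clearing steps above can also be scripted. This is only a sketch: clear_cache is a hypothetical helper, and it takes the folder as an argument so that a narrower directory than the whole .cache folder can be targeted.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir: Path) -> bool:
    """Delete cache_dir and everything in it; return True if it existed."""
    if not cache_dir.is_dir():
        return False
    # Removes the folder recursively, so files are re-downloaded on next run.
    shutil.rmtree(cache_dir)
    return True
```

For example, clear_cache(Path.home() / ".cache" / "torch" / "transformers") would remove only the library's downloads, assuming that is where this version stores them; that location is an assumption worth checking on your machine before deleting anything.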
🐛 Bug
Information
Model I am using (Bert, XLNet ...): GPT2
Language I am using the model on (English, Chinese ...): English
The problem arises when using:
The tasks I am working on is:
To reproduce
Steps to reproduce the behavior:
This gives an error because "model_name_or_path" is assumed to be a JSON file that contains pretrained model info, not a saved checkpoint file. The error occurs when trying to load the CONFIG file associated with a pretrained model.
I also tried to create a new "model_checkpoint" argument that I then pass into AutoModelWithLMHead.from_pretrained(), but that ends up with a model/checkpoint mismatch (it looks like the hidden size in the checkpoint file = 256, but in the current model = 768). In my usage here, I have never changed the hidden size; I just used the --do_train option, and it saved my checkpoints to the output directory. Now I am just trying to verify that I can eval on a checkpoint, and then also continue training from a checkpoint.
Expected behavior
I expected to be able to specify a checkpoint_path argument in run_language_modeling.py that would load the checkpoint file and let me continue training on it and/or evaluate it.
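As a rough illustration of what resolving such a checkpoint could look like, the sketch below picks the most recent checkpoint folder in an output directory. The helper latest_checkpoint is hypothetical and not part of run_language_modeling.py; it assumes checkpoints are saved as folders named checkpoint-<step>, as in the checkpoint-17500 path quoted in this thread.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme: checkpoint-<global step>, e.g. checkpoint-17500.
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the checkpoint folder with the highest step, or None."""
    best_step, best_path = -1, None
    for child in Path(output_dir).iterdir():
        match = _CHECKPOINT_RE.match(child.name)
        # Compare steps numerically so that 17500 ranks above 9000.
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path
```

The returned folder could then be passed as --model_name_or_path, provided it contains everything the model and tokenizer need to load.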
Environment info
transformers version: 2.9.0