Status: Closed (Aniruddha-JU closed this issue 2 years ago)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Environment info
- transformers version: 4.13
- Model: google/mt5-base
cc @patil-suraj

Command:
python run_summarization.py --model_name_or_path google/mt5-base --do_train --do_predict --train_file /home/aniruddha/mt5_data/bengali_8_shot.json --test_file /home/aniruddha/mt5_data/ben_dev.json --source_prefix "summarize: " --output_dir mt5_ben_16_667/ --overwrite_output_dir --per_device_train_batch_size=1 --per_device_eval_batch_size=4 --predict_with_generate --seed 667 --save_steps 14000000 --num_beams 3
Error:
Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 13.5967, 'train_samples_per_second': 1.765, 'train_steps_per_second': 0.883, 'train_loss': 12.063149770100912, 'epoch': 3.0}
100%|██████████| 12/12 [00:13<00:00, 1.13s/it]
[INFO|trainer.py:1995] 2021-11-06 20:20:47,931 >> Saving model checkpoint to mt5_ben_16_667/
[INFO|configuration_utils.py:417] 2021-11-06 20:20:47,932 >> Configuration saved in mt5_ben_16_667/config.json
Traceback (most recent call last):
  File "run_summarization.py", line 648, in <module>
    main()
  File "run_summarization.py", line 571, in main
    trainer.save_model()  # Saves the tokenizer too for easy upload
  File "/home/aniruddha/anaconda3/envs/ani/lib/python3.8/site-packages/transformers/trainer.py", line 1961, in save_model
    self._save(output_dir)
  File "/home/aniruddha/anaconda3/envs/ani/lib/python3.8/site-packages/transformers/trainer.py", line 2009, in _save
    self.model.save_pretrained(output_dir, state_dict=state_dict)
  File "/home/aniruddha/anaconda3/envs/ani/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1053, in save_pretrained
    del state_dict[ignore_key]
KeyError: 'encoder\.embed_tokens\.weight'
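A note on what the traceback suggests: the failing key `'encoder\.embed_tokens\.weight'` contains escaped dots, i.e. it looks like a regex pattern rather than an actual state-dict key (real keys are plain strings like `encoder.embed_tokens.weight`). Below is a minimal, self-contained sketch of that failure mode, using hypothetical state-dict keys and a hypothetical `ignore_keys` list (not the actual transformers internals): deleting a regex pattern as a literal dict key raises exactly this kind of KeyError, while matching the pattern against the keys first does not.

```python
import re

# Hypothetical state dict: plain dotted names, as in a real PyTorch model.
state_dict = {"encoder.embed_tokens.weight": "tensor", "decoder.block.0.weight": "tensor"}
# Hypothetical ignore list holding a regex pattern (note the escaped dots).
ignore_keys = [r"encoder\.embed_tokens\.weight"]

# Literal deletion fails: the escaped pattern string is not itself a dict key.
try:
    for k in ignore_keys:
        del state_dict[k]
except KeyError as e:
    print("KeyError:", e)

# Matching the pattern against the keys instead succeeds.
state_dict = {"encoder.embed_tokens.weight": "tensor", "decoder.block.0.weight": "tensor"}
for pattern in ignore_keys:
    for key in [k for k in state_dict if re.fullmatch(pattern, k)]:
        del state_dict[key]
print(sorted(state_dict))
```

The remaining dict contains only the non-ignored key, which is the behavior the literal `del state_dict[ignore_key]` in the traceback appears to be missing.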