20:54:53 | building dictionary first...
20:54:53 | No model with opt yet at: /tmp/test_train_90M(.opt)
20:54:53 | Using CUDA
20:54:53 | loading dictionary from .../ParlAI/data/models/tutorial_transformer_generator/model.dict
20:54:53 | num words = 23928
20:54:53 | DEPRECATED: XLM should only be used for backwards compatibility, as it involves a less-stable layernorm operation.
20:54:55 | Total parameters: 71,628,800 (71,628,800 trainable)
20:54:55 | Loading existing model params from .../ParlAI/data/models/tutorial_transformer_generator/model
Traceback (most recent call last):
  File "/home/ilya/miniconda3/envs/dl/bin/parlai", line 33, in <module>
    sys.exit(load_entry_point('parlai', 'console_scripts', 'parlai')())
  File "/home/ilya/repos/ParlAI/parlai/__main__.py", line 14, in main
    superscript_main()
  File "/home/ilya/repos/ParlAI/parlai/core/script.py", line 325, in superscript_main
    return SCRIPT_REGISTRY[cmd].klass._run_from_parser_and_opt(opt, parser)
  File "/home/ilya/repos/ParlAI/parlai/core/script.py", line 108, in _run_from_parser_and_opt
    return script.run()
  File "/home/ilya/repos/ParlAI/parlai/scripts/train_model.py", line 997, in run
    self.train_loop = TrainLoop(self.opt)
  File "/home/ilya/repos/ParlAI/parlai/scripts/train_model.py", line 353, in __init__
    self.agent = create_agent(opt)
  File "/home/ilya/repos/ParlAI/parlai/core/agents.py", line 479, in create_agent
    model = model_class(opt)
  File "/home/ilya/repos/ParlAI/parlai/core/torch_generator_agent.py", line 516, in __init__
    states = self.load(init_model)
  File "/home/ilya/repos/ParlAI/parlai/core/torch_agent.py", line 2074, in load
    states = torch.load(
  File "/home/ilya/miniconda3/envs/dl/lib/python3.8/site-packages/torch/serialization.py", line 608, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/ilya/miniconda3/envs/dl/lib/python3.8/site-packages/torch/serialization.py", line 794, in _legacy_load
    deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: storage has wrong size: expected 17592188352039 got 1048576
Bug description
I get a RuntimeError when fine-tuning the model with the command described in Recipes.
Reproduction steps
parlai train_model -t blended_skill_talk -m transformer/generator \
  --init-model zoo:tutorial_transformer_generator/model \
  --dict-file zoo:tutorial_transformer_generator/model.dict \
  --embedding-size 512 --n-layers 8 --ffn-size 2048 --dropout 0.1 --n-heads 16 \
  --learn-positional-embeddings True --n-positions 512 --variant xlm --activation gelu \
  --fp16 True --text-truncate 512 --label-truncate 128 --dict-tokenizer bpe --dict-lower True \
  -lr 1e-06 --optimizer adamax --lr-scheduler reduceonplateau --gradient-clip 0.1 \
  -veps 0.25 --betas 0.9,0.999 --update-freq 1 --attention-dropout 0.0 --relu-dropout 0.0 \
  --skip-generation True -vp 15 -stim 60 -vme 20000 -bs 16 -vmt ppl -vmm min \
  --save-after-valid True --model-file /tmp/test_train_90M
Expected behavior
Fine-tuning starts normally.
Logs
See the console output and traceback above.
Additional context
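For what it's worth, the crash happens inside the torch.load call on the init model, and the numbers in the error (expected 17592188352039, got 1048576, i.e. exactly 2^20) look more like a truncated or corrupted checkpoint file than a problem in the training loop itself. Below is a minimal diagnostic sketch, not part of the repro: it just tries to load the downloaded zoo checkpoint directly with torch.load. The CHECKPOINT_PATH is a placeholder; point it at wherever your ParlAI data directory actually is.

```python
# Minimal diagnostic sketch: load the downloaded zoo checkpoint directly
# with torch.load, outside of parlai train_model.
# CHECKPOINT_PATH is a placeholder -- adjust it to your ParlAI data dir,
# e.g. <datapath>/models/tutorial_transformer_generator/model.
import os
import torch

CHECKPOINT_PATH = "/path/to/ParlAI/data/models/tutorial_transformer_generator/model"

print("file size on disk:", os.path.getsize(CHECKPOINT_PATH), "bytes")

try:
    states = torch.load(CHECKPOINT_PATH, map_location="cpu")
    print("checkpoint loads outside ParlAI:", type(states))
    if isinstance(states, dict):
        print("top-level keys:", list(states.keys()))
except RuntimeError as err:
    # Hitting the same "storage has wrong size" error here would mean the
    # downloaded file itself is truncated/corrupted, independent of train_model.
    print("checkpoint fails to load:", err)
```

If the direct load fails the same way, deleting the files under models/tutorial_transformer_generator/ so that ParlAI fetches them again is probably the simplest thing to try.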