LuoweiZhou / VLP

Vision-Language Pre-training for Image Captioning and Question Answering
Apache License 2.0

Not able to load pre-trained model on CC dataset #35

Closed chenyez closed 3 years ago

chenyez commented 3 years ago

Hi Luowei,

Thank you so much for sharing the pre-trained model along with such detailed instructions.

I followed the instructions to install VLP and am using Python 3.6 and PyTorch 1.1.0. I also downloaded the checkpoint file pre-trained on the CC dataset (cc_g8_lr1e-4_batch512_s0.75_b0.25.tar.gz).

However, when I load the model in the code (line 344 in run_img2txt_dist.py): model_recover = torch.load(args.model_recover_path)

I received the following error:

Traceback (most recent call last):
  File "vlp/run_img2txt_dist.py", line 629, in <module>
    main()
  File "vlp/run_img2txt_dist.py", line 362, in main
    model_recover = torch.load(args.model_recover_path,encoding='latin1')
  File "/home/ubuntu/anaconda3/envs/vlp/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/ubuntu/anaconda3/envs/vlp/lib/python3.6/site-packages/torch/serialization.py", line 564, in _load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x1f'.

I also tried loading the file (cc_g8_lr1e-4_batch512_s0.75_b0.25.tar.gz) with encoding="latin1", but the error persists.

Could you please let me know where the error might be? Thank you!

Regards,

Chenye

chenyez commented 3 years ago

I also tried running gunzip on cc_g8_lr1e-4_batch512_s0.75_b0.25.tar.gz first and then loading the resulting file, cc_g8_lr1e-4_batch512_s0.75_b0.25.tar.

I received the following error:

Traceback (most recent call last):
  File "vlp/run_img2txt_dist.py", line 637, in <module>
    main()
  File "vlp/run_img2txt_dist.py", line 370, in main
    model = torch.load(args.model_recover_path, map_location=lambda storage, loc: storage, pickle_module=pickle)
  File "/home/ubuntu/anaconda3/envs/vlp/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/ubuntu/anaconda3/envs/vlp/lib/python3.6/site-packages/torch/serialization.py", line 556, in _load
    return legacy_load(f)
  File "/home/ubuntu/anaconda3/envs/vlp/lib/python3.6/site-packages/torch/serialization.py", line 470, in legacy_load
    tar.extract('storages', path=tmpdir)
  File "/home/ubuntu/anaconda3/envs/vlp/lib/python3.6/tarfile.py", line 2041, in extract
    tarinfo = self.getmember(member)
  File "/home/ubuntu/anaconda3/envs/vlp/lib/python3.6/tarfile.py", line 1752, in getmember
    raise KeyError("filename %r not found" % name)
KeyError: "filename 'storages' not found"

Some online discussions of similar issues suggest that this happens when the file is corrupted. Please correct me if I am wrong. I look forward to your reply. Thank you!
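
For reference, both errors above are consistent with an archive, rather than a checkpoint file, being passed to torch.load. The byte '\x1f' is the first byte of the gzip magic number (0x1f 0x8b), and the second traceback shows torch.load handing the gunzipped file to its legacy tar loader, which looks for a member named 'storages'. Below is a minimal sketch for checking what kind of file a path points to before loading it. The helper name describe_file is hypothetical and is not part of VLP or PyTorch.

```python
import tarfile

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def describe_file(path):
    """Return 'gzip', 'tar', or 'other' depending on the file's contents.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        head = f.read(2)
    # Check gzip first: tarfile.is_tarfile() also accepts compressed tarballs.
    if head == GZIP_MAGIC:
        return "gzip"
    if tarfile.is_tarfile(path):
        return "tar"
    return "other"
```

If this returns 'gzip' or 'tar' for the path given as model_recover_path, the file is most likely still an archive that has to be unpacked before loading.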

chenyez commented 3 years ago

Stupid question: I needed to extract the file first... Sorry.
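
For anyone hitting the same error, here is a rough sketch of the fix: unpack the downloaded .tar.gz (the command-line equivalent is tar -xzf) and pass a file from inside it to torch.load, not the archive itself. The helper extract_archive and the "checkpoints" output directory are hypothetical, and the names of the files inside the archive are not stated in this thread, so list them after extraction before choosing one. Only extract archives from a source you trust.

```python
import os
import tarfile


def extract_archive(archive_path, dest_dir):
    """Extract a .tar or .tar.gz archive and return paths of the regular files inside.

    Hypothetical helper for illustration only.
    """
    # "r:*" lets tarfile detect the compression (none, gzip, ...) automatically.
    with tarfile.open(archive_path, "r:*") as tar:
        members = [m for m in tar.getmembers() if m.isfile()]
        tar.extractall(path=dest_dir)
    return [os.path.join(dest_dir, m.name) for m in members]


# Usage sketch (assumes the archive is in the current directory):
#
#   files = extract_archive("cc_g8_lr1e-4_batch512_s0.75_b0.25.tar.gz", "checkpoints")
#   print(files)  # inspect the list and pick the checkpoint file
#
#   import torch
#   model_recover = torch.load(path_chosen_from_files, map_location="cpu")
```

The extracted checkpoint path, not the archive path, is then what goes into the model_recover_path argument.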