Getting error while loading checkpoints

EnesAltinisik commented 2 years ago

I tried to reproduce the results of DAT with the code present in the Readme.md. However, when I try to load checkpoints, I always get this error: PytorchStreamReader failed reading zip archive: failed finding central directory

I tried several torch versions to see whether the problem is version-related, but it still persists.

Can you share more details about your environment?

vtddggg commented 2 years ago

Sorry for the inconvenience.

We have provided a Dockerfile that contains our running environment.

By the way, can you tell us which checkpoints you want to load and which script you run? With that we can try to reproduce the error and check whether it is a bug that needs to be fixed.

Thanks!!

EnesAltinisik commented 2 years ago

Because of my working environment, I cannot run the Docker image. (I tried the same configuration with conda.)

I got errors for all checkpoints in the adversarial training benchmark:

Creating model: vit_base_patch16_224
Traceback (most recent call last):
  File "benchmarks/benchmark.py", line 137, in <module>
    main()
  File "benchmarks/benchmark.py", line 84, in main
    ckpt = model_zoo.load_url(args.ckpt_path)
  File ".............../anaconda3/envs/easyRob/lib/python3.7/site-packages/torch/hub.py", line 528, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location)
  File ".............../anaconda3/envs/easyRob/lib/python3.7/site-packages/torch/serialization.py", line 585, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "................../anaconda3/envs/easyRob/lib/python3.7/site-packages/torch/serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory

vtddggg commented 2 years ago

I ran the following command:

python benchmarks/benchmark.py --adv --model vit_base_patch16_224 --ckpt_path http://alisec-competition.oss-cn-shanghai.aliyuncs.com/xiaofeng/imagenet_pretrained_models/advtrain_models/advtrain_vit_base_patch16_224_ep4.pth

and the weights loaded successfully for me.

Here is a possible solution for your case. Could you please run md5sum advtrain_vit_base_patch16_224_ep4.pth to check whether the weights were fully downloaded? For a correct checkpoint, the MD5 value should be 3da2ce407c9e84b6c3acd93eccb1f592
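
If md5sum is not available, the same check can be done from Python. This is only a minimal sketch: the helper names file_md5 and checkpoint_is_intact are hypothetical and not part of easyrobust, and the expected hash is the one quoted above.

```python
import hashlib
from pathlib import Path

# MD5 quoted in this thread for advtrain_vit_base_patch16_224_ep4.pth
EXPECTED_MD5 = "3da2ce407c9e84b6c3acd93eccb1f592"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoint_is_intact(path, expected_md5=EXPECTED_MD5):
    """True if the file exists and its MD5 matches the expected value."""
    path = Path(path)
    return path.is_file() and file_md5(path) == expected_md5.lower()
```

Point checkpoint_is_intact at the cached copy of the file. With default settings, torch.hub usually keeps downloads in a checkpoints folder under its hub cache directory, but the exact location depends on your setup.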

EnesAltinisik commented 2 years ago

Thank you for your answer. You are right. For some reason, the models were corrupted during download. I re-downloaded the files, and the problem was solved. Sorry for bothering you.
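
For anyone hitting the same message: "failed finding central directory" is typical of a zip-format checkpoint that was cut off during download, because the central directory sits at the end of the archive. The sketch below uses Python's standard zipfile module to spot such a file and delete it so it can be downloaded again. The function names are hypothetical, and the check assumes the checkpoint was saved in the zip-based format (older non-zip checkpoints would also fail it).

```python
import os
import zipfile


def has_zip_central_directory(path):
    """True if the end-of-central-directory record of a zip archive is found."""
    return zipfile.is_zipfile(path)


def discard_if_truncated(path):
    """Delete `path` if it exists but is not a readable zip archive.

    Returns True if the file was deleted, so the caller knows to re-download.
    """
    if os.path.isfile(path) and not has_zip_central_directory(path):
        os.remove(path)
        return True
    return False
```

After the corrupted cached file is removed, running the benchmark command again should trigger a fresh download.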

alibaba / easyrobust

Getting error while loading checkpoints #3