falcon-xu / LGViT

Official PyTorch implementation of "LGViT: Dynamic Early Exiting for Accelerating Vision Transformer" (ACM MM 2023)
https://arxiv.org/abs/2308.00255
MIT License

Issues loading the pretrained model #3

Open ionymikler opened 1 week ago

ionymikler commented 1 week ago

Hi @falcon-xu !

I am trying to run the eval_highway_deit.sh script using the model you published in this HF repo, but I am running into some issues.

Originally, your eval_highway_deit.sh script has the following lines:

python -m torch.distributed.launch --nproc_per_node=3 --master_port=29566 --nnodes=1 ../examples/run_highway.py \
... \
    --model_name_or_path /xxx/path/to/Early_Exit/checkpoint_path \

I thought --model_name_or_path should point to the pytorch_model.bin file from the HF repo, but I get the following error (only a snippet is shown here; I uploaded a txt file with the full traceback):

Traceback (most recent call last):
  File "/zhome/57/8/181461/thesis/lgvit/lgvit_repo/examples/run_highway_deit.py", line 929, in <module>
    main()
  File "/zhome/57/8/181461/thesis/lgvit/lgvit_repo/examples/run_highway_deit.py", line 795, in main
    config = DeiTConfig.from_pretrained(
  File "/work3/s222962/miniconda3/envs/lgvit/lib/python3.9/site-packages/transformers/configuration_utils.py", line 538, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
...
OSError: It looks like the config file at '/zhome/57/8/181461/thesis/lgvit/LGViT-ViT-Cifar100/pytorch_model.bin' is not a valid JSON file.

It looks like it is asking for a config file, but I'm unsure what that means. I then tried providing the config.json from the same HF model repo, but that didn't work either. This time it gave this error:

OSError: Unable to load weights from pytorch checkpoint file for '/zhome/57/8/181461/thesis/lgvit/LGViT-ViT-Cifar100/config.json'

I guess my question is really how to provide the script with the pretrained model?

Also, why are some scripts named 'highway'? What does that name mean?

Thanks for your help!

falcon-xu commented 1 week ago

Hi @ionymikler, the problem might be the way the pretrained checkpoint is being loaded. The --model_name_or_path parameter should point to the directory containing the pretrained model weights. This directory must include at least the following files:

Thus, please ensure that --model_name_or_path points to the complete model folder downloaded from the Hugging Face repository, not to a single file path.
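
As a quick sanity check before launching the script, something like the sketch below could confirm that the path is a folder and not a single file. The file names config.json and pytorch_model.bin are an assumption taken from the paths in the error messages above, and the helper name missing_model_files is hypothetical; the actual set of required files may be larger.

```python
from pathlib import Path

# Assumed file names, taken from the paths in the error messages in this
# thread; the real list of required files may be longer.
REQUIRED_FILES = ("config.json", "pytorch_model.bin")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from model_dir.

    Raises NotADirectoryError when model_dir is not a directory, which is
    what happens when pytorch_model.bin or config.json is passed directly.
    """
    path = Path(model_dir)
    if not path.is_dir():
        raise NotADirectoryError(
            f"{path} is not a directory; pass the folder that holds the checkpoint files"
        )
    # Keep the order of `required` so the report is predictable.
    return [name for name in required if not (path / name).is_file()]
```

An empty list from missing_model_files on the downloaded folder would suggest the same path is suitable for --model_name_or_path, while a NotADirectoryError points to the single-file mistake described above.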