RongchangLi opened this issue 1 year ago
Sorry, I did not explain that ubuntu_train_vpt_vtab.sh is a file equivalent to slurm_train_vpt_vtab.sh.
I tried setting lr=1e-2 and weight_decay=1e-4, and the result improved to 71.21. Changing the number of prompts to 50 (which may also be the setting used in your paper) brings it to 73.57, but that is still much lower than the 78.8 reported in your paper.
So, could you please provide the training details for VPT? It would help a lot~
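For reference, this is roughly the setup I am assuming when I talk about these hyperparameters: the ViT-B/16 backbone frozen, with only the prompt tokens and the classification head trained by SGD. A minimal sketch with illustrative names, not this repo's actual code:

```python
# Minimal sketch of the VPT-style setup I am assuming (illustrative names):
# freeze the backbone, train only the prompt tokens and the classification head
# with SGD at the hyperparameters mentioned above.
import torch
import torch.nn as nn

class PromptedViT(nn.Module):
    def __init__(self, backbone: nn.Module, embed_dim: int = 768,
                 num_prompts: int = 50, num_classes: int = 100):
        super().__init__()
        self.backbone = backbone
        self.prompts = nn.Parameter(torch.empty(num_prompts, embed_dim))
        nn.init.uniform_(self.prompts, -0.5, 0.5)   # assumption: uniform prompt init
        self.head = nn.Linear(embed_dim, num_classes)
        for p in self.backbone.parameters():        # backbone stays frozen
            p.requires_grad = False

model = PromptedViT(backbone=nn.Identity())         # placeholder backbone for the sketch
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-2, momentum=0.9, weight_decay=1e-4)
```

If the paper uses a different optimizer, schedule, or prompt initialization than this, that alone could explain a gap of several points, which is why the full training details would help.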
I can use the official VPT code and the imagenet21k_ViT-B_16.npz model from https://github.com/jeonsworld/ViT-pytorch (the link from the VPT codebase's License) to reproduce the VPT results on CIFAR-100. However, it seems that they replace the backbone with ViT-B_16.npz from Google, and I'm confused why the results from these two ViT-B checkpoints are so different.
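If it helps narrow things down, a rough way to check whether the two .npz checkpoints actually contain different weights (filenames taken from the links above; the key layouts may differ between the two releases):

```python
import numpy as np

# Compare key sets and per-tensor differences of the two ViT-B/16 checkpoints.
a = np.load("imagenet21k_ViT-B_16.npz")
b = np.load("ViT-B_16.npz")

print("keys only in one file:", sorted(set(a.files) ^ set(b.files)))
for k in sorted(set(a.files) & set(b.files)):
    if a[k].shape != b[k].shape:
        print(k, "shape mismatch:", a[k].shape, b[k].shape)
    else:
        diff = float(np.abs(a[k].astype(np.float64) - b[k].astype(np.float64)).mean())
        if diff > 1e-6:
            print(f"{k}: mean |diff| = {diff:.4g}")
```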
I met the same problem.
In the paper, the result of VPT on CIFAR-100 is 78.8, but I reproduced a much worse result: 64.39.
Here is my command:
bash configs/VPT/VTAB/ubuntu_train_vpt_vtab.sh experiments/VPT/ViT-B_prompt_vpt_100.yaml ViT-B_16.npz
lr is 0.001 and weight decay is 0.0001. Here is the output:
Can you help me find the reason?