KMnP / vpt

❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119

vtab1k Dataset accuracy #2

Closed · 111chengxuyuan closed this issue 2 years ago

111chengxuyuan commented 2 years ago

Hello, I have run the vtab-1k experiments on three datasets, but my results differ considerably from the paper: 72.4 on cifar100, 15.7 on smallnorb/azimuth, and 22.6 on smallnorb/elevation. I don't know why. Is my config file wrong? It is as follows:

          NUM_GPUS: 1
          NUM_SHARDS: 1
          OUTPUT_DIR: ""
          RUN_N_TIMES: 1
          MODEL:
            TRANSFER_TYPE: "prompt"
            TYPE: "vit"
            LINEAR:
              MLP_SIZES: []
          SOLVER:
            SCHEDULER: "cosine"
            PATIENCE: 300
            LOSS: "softmax"
            OPTIMIZER: "sgd"
            MOMENTUM: 0.9
            WEIGHT_DECAY: 0.0001
            LOG_EVERY_N: 100
            WARMUP_EPOCH: 10
            TOTAL_EPOCH: 100
          DATA:
            NAME: "vtab-cifar(num_classes=100)"
            NUMBER_CLASSES: 100
            DATAPATH: "/home/vpt/dataset"
            FEATURE: "sup_vitb16_224"
            BATCH_SIZE: 128
KMnP commented 2 years ago

Hi @111chengxuyuan, to debug, I wonder if you could use one of the 3 vtab datasets in demo.ipynb and see if the results match.

KMnP commented 2 years ago

After a quick check of the config, the feature is different: it should be "sup_vitb16_imagenet21k" instead of "sup_vitb16_224".

But I still recommend using the 3 vtab datasets in the demo as a starting point for reproducing our results.
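
For reference, a minimal sketch of the corrected DATA block; only FEATURE changes relative to the config posted above:

    DATA:
      NAME: "vtab-cifar(num_classes=100)"
      NUMBER_CLASSES: 100
      DATAPATH: "/home/vpt/dataset"
      FEATURE: "sup_vitb16_imagenet21k"  # was "sup_vitb16_224"
      BATCH_SIZE: 128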

111chengxuyuan commented 2 years ago

OK, thank you, I'll try.

KMnP commented 2 years ago

Let me know how it goes! By the way, could you also use the lr, wd, and seed values specified in the demo?

111chengxuyuan commented 2 years ago

Yes, the demo works well, and the experimental results now match those in the paper. Thank you!

KMnP commented 2 years ago

Awesome, glad to hear it! I'll close the issue for now; feel free to reopen if you have other questions!

zhaoedf commented 2 years ago

> After a quick check of the config, the feature is different: it should be "sup_vitb16_imagenet21k" instead of "sup_vitb16_224".
>
> But I still recommend using the 3 vtab datasets in the demo as a starting point for reproducing our results.

What is the difference between these two models? Were they pre-trained on different datasets?

Also, is the link provided in the README for sup_vitb16_imagenet21k or sup_vitb16_224?

qianlanwyd commented 2 years ago

sup_vitb16_imagenet21k is not supported since you did not provide imagenet21k_ViT-B_16.npz. I can only set "sup_vitb16" as the pretrained model.

zhaoedf commented 2 years ago

> sup_vitb16_imagenet21k is not supported since you did not provide imagenet21k_ViT-B_16.npz. I can only set "sup_vitb16" as the pretrained model.

Can you be more specific? I don't quite understand you.

qianlanwyd commented 2 years ago

> sup_vitb16_imagenet21k is not supported since you did not provide imagenet21k_ViT-B_16.npz. I can only set "sup_vitb16" as the pretrained model.

> Can you be more specific? I don't quite understand you.

I am asking a similar question to yours.

qianlanwyd commented 2 years ago

I cannot run with the feature set to sup_vitb16_imagenet21k.

KMnP commented 2 years ago

@qianlanwyd @zhaoedf I see there is some confusion over the feature names. The link provided in the README is for ViT-Base pre-trained on ImageNet-21k, which is the main pre-trained model we used in the paper.

Due to a legacy issue, I renamed the downloaded ckpt from ViT-B_16.npz to imagenet21k_ViT-B_16.npz; this new name is what the model-building code looks for. If you don't rename the checkpoint and set DATA.FEATURE = "sup_vitb16_imagenet21k", you will get a FileNotFound error.

There are two solutions: (1) rename the downloaded ckpt to imagenet21k_ViT-B_16.npz, or (2) change L35 of src/models/build_vit_backbone.py to use ViT-B_16.npz instead.

I recommend solution (1). I have also added a note about this to the README.
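
If it helps, here is a minimal sketch of solution (1) in Python. The checkpoint directory below is only an example; adjust it to wherever you downloaded the file:

    from pathlib import Path

    # Example location only: point this at the directory holding the downloaded ckpt.
    ckpt_dir = Path("/home/vpt/models")

    src = ckpt_dir / "ViT-B_16.npz"              # filename as downloaded
    dst = ckpt_dir / "imagenet21k_ViT-B_16.npz"  # filename the code expects for "sup_vitb16_imagenet21k"

    if src.exists() and not dst.exists():
        src.rename(dst)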