mlfoundations / wise-ft

Robust fine-tuning of zero-shot models
https://arxiv.org/abs/2109.01903

About fine-tuning #20

Closed ariawoo closed 1 year ago

ariawoo commented 1 year ago

Hi, good work here. I am following the steps to fine-tune CLIP. I downloaded the two datasets used in your example and simplified the script to this:

```
python src/wise_ft.py \
    --train-dataset=ImageNetR \
    --epochs=10 \
    --lr=0.00003 \
    --batch-size=32 \
    --cache-dir=cache \
    --model=ViT-B/32 \
    --eval-datasets=ImageNetR,ImageNetA \
    --template=openai_imagenet_template \
    --results-db=results.jsonl \
    --save=models/wiseft/ViTB32 \
    --data-location=~/data \
    --alpha 0 0.5 0.9
```

Then I got the following error. I checked the code and found there is no such attribute as train_loader. Is that because of recent updates to the code? Can you please give me some hints? Thanks.

```
Traceback (most recent call last):
  File "/Users/happymind/local_dev/wise-ft/src/wise_ft.py", line 104, in <module>
    wise_ft(args)
  File "/Users/happymind/local_dev/wise-ft/src/wise_ft.py", line 61, in wise_ft
    finetuned_checkpoint = finetune(args)
                           ^^^^^^^^^^^^^^
  File "/Users/happymind/local_dev/wise-ft/src/models/finetune.py", line 50, in finetune
    num_batches = len(dataset.train_loader)
                      ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'ImageNetR' object has no attribute 'train_loader'. Did you mean: 'test_loader'?
```

hzpro1221 commented 1 year ago

It seems like the problem is `--train-dataset=ImageNetR`. The ImageNetR class is designed to be used as an evaluation dataset, not a training dataset.
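To illustrate the failure mode, here is a minimal sketch (with hypothetical classes, not the actual wise-ft code): evaluation-only dataset classes expose a `test_loader` but no `train_loader`, so the line `len(dataset.train_loader)` in the fine-tuning loop raises the `AttributeError` from the traceback above.

```python
class ImageNetRLike:
    """Sketch of an evaluation-only dataset: test_loader but no train_loader."""
    def __init__(self):
        self.test_loader = [("image", "label")]  # placeholder for a DataLoader


class ImageNetLike:
    """Sketch of a full dataset: both loaders, so it can be a --train-dataset."""
    def __init__(self):
        self.train_loader = [("image", "label")]
        self.test_loader = [("image", "label")]


def num_batches(dataset):
    # Mirrors the failing line in finetune.py.
    return len(dataset.train_loader)
```

Passing an `ImageNetLike` instance works, while an `ImageNetRLike` instance reproduces the `AttributeError`.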

ariawoo commented 1 year ago

Thank you @hzpro1221. I have 2 follow-up questions here.

  1. I went to the repo that shows how to download the datasets ( https://github.com/huggingface/transformers/tree/main/examples/pytorch/contrastive-image-text), but I didn't find a script explaining how to download the ImageNet dataset. If you know how, can you please give some guidance? Thanks.

  2. I pasted the original script below, and you might notice that ImageNet is used both for training and for evaluation. That's why I thought ImageNetR could do the same, but I might be wrong. I also found that some arguments, like the flag --freeze-encoder, are no longer in the code base. Please correct me if I am wrong. Much appreciated.

```
python src/wise_ft.py \
    --train-dataset=ImageNet \
    --epochs=10 \
    --lr=0.00003 \
    --batch-size=512 \
    --cache-dir=cache \
    --model=ViT-B/32 \
    --eval-datasets=ImageNet,ImageNetV2,ImageNetR,ImageNetA,ImageNetSketch \
    --template=openai_imagenet_template \
    --results-db=results.jsonl \
    --save=models/wiseft/ViTB32 \
    --data-location=~/data \
    --alpha 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
```

gabrielilharco commented 1 year ago

Hi @Coffeyliu

  1. See https://www.image-net.org/download.php
  2. As @hzpro1221 said, ImageNet-R doesn't have a training set. In contrast, ImageNet has both a training set and an evaluation set.
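A defensive check one could add before launching fine-tuning (a sketch only, with a hypothetical `check_trainable` helper that is not part of the repo): fail early with a clear message when the chosen `--train-dataset` has no training split, rather than hitting the `AttributeError` deep inside the training loop.

```python
def check_trainable(dataset, name):
    """Raise a descriptive error if `dataset` cannot serve as a train dataset."""
    if not hasattr(dataset, "train_loader"):
        raise ValueError(
            f"{name} is evaluation-only (no train_loader); "
            "pass it via --eval-datasets instead of --train-dataset."
        )
```

Calling this right after the dataset is constructed turns the confusing traceback into an actionable message.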