mlfoundations / wise-ft

Robust fine-tuning of zero-shot models
https://arxiv.org/abs/2109.01903

Where to find pre-trained model weights #13

Open moon001light opened 1 year ago

moon001light commented 1 year ago

Hi, newbie here. I am trying to fine-tune this model of yours, which was uploaded to Hugging Face: https://huggingface.co/laion/CLIP-ViT-L-14-laion2B-s32B-b82K

I want to fine-tune it on my custom dataset.

Looking at the example below, the checkpoints to load are .pt files. May I ask where I can find these checkpoints for the pre-trained model specified in the link?

python src/wise_ft.py   \
    --eval-datasets=ImageNet,ImageNetV2,ImageNetR,ImageNetA,ImageNetSketch  \
    --load=models/zeroshot.pt,models/finetuned.pt  \
    --results-db=results.jsonl  \
    --save=models/wiseft  \
    --data-location=~/data \
    --alpha 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

Side question: why do I need to pass the finetuned.pt checkpoint for fine-tuning? Won't I be missing the fine-tuned weights before I start fine-tuning on my custom dataset?

gabrielilharco commented 1 year ago

Hey @moon001light. The command you posted is meant for running interpolation, provided you already have both the zero-shot and fine-tuned checkpoints (hence you need to pass both paths). If you're trying to fine-tune the model yourself (instead of loading an existing fine-tuned checkpoint), your command should look more like the second part of this doc, https://github.com/mlfoundations/wise-ft/tree/master#run-wise-ft, as below:

python src/wise_ft.py   \
    --train-dataset=ImageNet  \
    --epochs=10  \
    --lr=0.00003  \
    --batch-size=512  \
    --cache-dir=cache  \
    --model=ViT-B/32  \
    --eval-datasets=ImageNet,ImageNetV2,ImageNetR,ImageNetA,ImageNetSketch  \
    --template=openai_imagenet_template  \
    --results-db=results.jsonl  \
    --save=models/wiseft/ViTB32  \
    --data-location=~/data \
    --alpha 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
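
To the side question in the original post: finetuned.pt is only needed for the interpolation step, not before fine-tuning. Conceptually, sweeping --alpha blends the weights of the two checkpoints. A minimal sketch of that interpolation (illustrative only, not the repo's actual implementation in src/wise_ft.py; it assumes the .pt files deserialize to modules exposing state_dict()):

import torch

def wise_ft_interpolate(zeroshot_state, finetuned_state, alpha):
    # Per-parameter linear interpolation between the two checkpoints (WiSE-FT).
    assert set(zeroshot_state.keys()) == set(finetuned_state.keys())
    return {
        key: (1 - alpha) * zeroshot_state[key] + alpha * finetuned_state[key]
        for key in zeroshot_state
    }

# alpha=0 recovers the zero-shot model, alpha=1 the fully fine-tuned one.
zeroshot = torch.load('models/zeroshot.pt')
finetuned = torch.load('models/finetuned.pt')
blended = wise_ft_interpolate(zeroshot.state_dict(), finetuned.state_dict(), alpha=0.5)
finetuned.load_state_dict(blended)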

Note that in the fine-tuning command you only specify the model with --model=ViT-B/32. Right now the codebase doesn't directly support other checkpoints from open_clip, but this should be easy to add if you change this code (https://github.com/mlfoundations/wise-ft/blob/master/src/models/modeling.py#LL13C19-L14C48) to something like:

import open_clip  # needed at the top of src/models/modeling.py
self.model, self.train_preprocess, self.val_preprocess = open_clip.create_model_and_transforms('ViT-L-14', pretrained='laion2b_s32b_b82k')

Hope this helps!

PS: we have a better ViT-L/14 model now that you might want to check out, datacomp_xl_s13b_b90k, which gets 79.2% zero-shot accuracy on ImageNet.
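
If you want to try that one, the same one-line change should work, e.g. (assuming your installed open_clip version includes that pretrained tag):

import open_clip

model, train_preprocess, val_preprocess = open_clip.create_model_and_transforms(
    'ViT-L-14', pretrained='datacomp_xl_s13b_b90k')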

moon001light commented 1 year ago

Thank you for the quick response @gabrielilharco. Since I am using a custom dataset (I just have images with labels), how should I structure the data folder so that I can pass it to the train-dataset and eval-datasets params?

gabrielilharco commented 1 year ago

You'll need to write some code to create the dataloader either way, so most folder structures will do as long as you write the corresponding loading code. A typical layout is one subfolder per class that stores all images from that class. There are many examples in the repo, e.g. https://github.com/mlfoundations/wise-ft/blob/master/src/datasets/imagenet.py#L8. A rough sketch of such a class is below.
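
For concreteness, here is a minimal sketch of such a dataset class, assuming an ImageFolder-style layout (one subfolder per class under train/ and val/). The constructor arguments and the my_dataset directory name are assumptions meant to mirror the existing classes in src/datasets/; check imagenet.py for exactly what the training code expects:

import os
import torch
import torchvision.datasets as datasets

class MyCustomDataset:
    # Hypothetical dataset class; not part of the wise-ft codebase.
    def __init__(self, preprocess, location=os.path.expanduser('~/data'),
                 batch_size=32, num_workers=16):
        # Expects location/my_dataset/train/<class_name>/*.jpg
        # and     location/my_dataset/val/<class_name>/*.jpg
        traindir = os.path.join(location, 'my_dataset', 'train')
        valdir = os.path.join(location, 'my_dataset', 'val')

        self.train_dataset = datasets.ImageFolder(traindir, transform=preprocess)
        self.train_loader = torch.utils.data.DataLoader(
            self.train_dataset, batch_size=batch_size,
            shuffle=True, num_workers=num_workers)

        self.test_dataset = datasets.ImageFolder(valdir, transform=preprocess)
        self.test_loader = torch.utils.data.DataLoader(
            self.test_dataset, batch_size=batch_size,
            shuffle=False, num_workers=num_workers)

        # Class names are used to build the zero-shot classification head.
        self.classnames = self.train_dataset.classes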

xiyangyang99 commented 1 year ago

@gabrielilharco Thank you for your guidance. While looking through the wise-ft project, I noticed that the imagenetv2_pytorch package imported in wise-ft/src/datasets/imagenetv2.py is not included in the repo. Is that file missing, or do I need to pip install it?

gabrielilharco commented 1 year ago

Yes, you can install it with pip install git+https://github.com/modestyachts/ImageNetV2_pytorch
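
Once installed, a quick sanity check that it is importable (class name taken from that package's README; adjust if your version differs):

import torch
from imagenetv2_pytorch import ImageNetV2Dataset

# Downloads the "matched-frequency" test set on first use.
dataset = ImageNetV2Dataset("matched-frequency")
loader = torch.utils.data.DataLoader(dataset, batch_size=32, num_workers=4)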