LAION-AI / CLIP_benchmark

CLIP-like model evaluation
MIT License

Improved linear evaluation that achieves better results #107

Closed teasgen closed 10 months ago

teasgen commented 1 year ago

In the updated linear evaluation, the dataset is divided into three parts: train, validation, and test. If a dataset does not already have a validation split, the train split is divided into two sections according to a specified proportion, which gives fairer results. I've also added regularization with the OpenAI hyperparameter sweep (https://arxiv.org/pdf/2103.00020.pdf, appendix A.3). The results are now much closer to the OpenAI metrics for CLIP models (same paper, Table 10).
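The validation-based regularization sweep described above can be sketched roughly as follows. This is a minimal illustration in the spirit of the CLIP paper's appendix A.3 (sweep the L2 strength, pick the best on the validation split), not the actual CLIP_benchmark implementation; the feature matrices are synthetic stand-ins for precomputed CLIP image features, and `sweep_l2` is a hypothetical helper name.

```python
# Minimal sketch: pick the L2 regularization strength for a linear
# probe by validation accuracy, in the style of CLIP appendix A.3.
# Synthetic features stand in for precomputed CLIP embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

def sweep_l2(train_x, train_y, val_x, val_y, lambdas):
    """Fit one logistic-regression probe per L2 strength and
    return the lambda with the best validation accuracy."""
    best_lam, best_acc = None, -1.0
    for lam in lambdas:
        # sklearn parameterizes by C, the inverse regularization strength
        clf = LogisticRegression(C=1.0 / lam, max_iter=1000)
        clf.fit(train_x, train_y)
        acc = clf.score(val_x, val_y)
        if acc > best_acc:
            best_lam, best_acc = lam, acc
    return best_lam, best_acc

rng = np.random.default_rng(0)
x = rng.normal(size=(300, 16))
y = (x[:, 0] + 0.1 * rng.normal(size=300) > 0).astype(int)
train_x, train_y = x[:200], y[:200]
val_x, val_y = x[200:], y[200:]

lam, acc = sweep_l2(train_x, train_y, val_x, val_y, np.logspace(-4, 2, 7))
print(f"best lambda: {lam}, val accuracy: {acc:.3f}")
```

After the sweep, the probe selected this way would typically be retrained on the combined train+validation data and scored once on the held-out test split.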

e.g. ViT-L-14 openai model:

| Dataset    | Current | New  | OpenAI | Diff decrease |
|------------|---------|------|--------|---------------|
| DTD        | 80.1    | 82.1 | 82.1   | -2.0          |
| Country211 | 38.7    | 42.1 | 42.9   | -1.4          |
| Food101    | 93.4    | 95.3 | 95.2   | -1.9          |
| Aircraft   | 62.4    | 67.5 | 69.4   | -4.1          |
| Cifar100   | 84.8    | 87.3 | 87.5   | -2.5          |
| Cifar10    | 97.7    | 98.0 | 98.0   | -0.3          |
| Hyperparameter | Value |
|----------------|-------|
| Batch size     | 512   |
| Epochs         | 20    |
| LR             | 0.1   |
Danil328 commented 11 months ago

Approve, please

mehdidc commented 11 months ago

Sorry for the delay and thank you very much for the PR @teasgen. I will have a look right after fixing #109

ankitkv commented 10 months ago

Hi @teasgen ! Do you happen to know the best setting to use your PR for linear probe on ImageNet?

teasgen commented 10 months ago

> Hi @teasgen ! Do you happen to know the best setting to use your PR for linear probe on ImageNet?

Hi, unfortunately I haven't tested my PR on ImageNet, but you can efficiently find the best hyperparameters using the CLI arguments. However, I used the same settings for all datasets, so you could try those first.
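For reference, a linear-probe run driven through the CLI might look like the sketch below. Flag names are taken from the project's README conventions at the time and may differ between versions; `--val_proportion` in particular is assumed from this PR. Check `clip_benchmark eval --help` for the exact options.

```shell
# Hedged sketch of a linear-probe evaluation via the CLI;
# verify flag names against `clip_benchmark eval --help`.
clip_benchmark eval --dataset=cifar10 --task=linear_probe \
    --model=ViT-L-14 --pretrained=openai \
    --batch_size=512 --fewshot_lr 0.1 --fewshot_epochs 20 \
    --val_proportion 0.2 --output=result.json
```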

mehdidc commented 10 months ago

Hi @teasgen, working fine for me! The only thing that would be nice to keep is the default behavior, i.e. not specifying a validation dataset. Currently, it fails with an error message if the validation set or validation proportion is not given. With this commit: https://github.com/LAION-AI/CLIP_benchmark/commit/396f8073f6c84ca230e4ecaa6d16db3e90a71d1c, I could make it work fine again, but I might have missed something. Could you please have a look/confirm?

Thanks!
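The fallback behavior discussed here can be sketched as follows: use an explicit validation split when one exists, carve one out of the train split when a proportion is given, and otherwise skip the sweep entirely. The function and argument names are illustrative, not CLIP_benchmark's actual API.

```python
# Illustrative sketch of the default-behavior fix: only create a
# validation split when one is provided or a proportion is given.
import numpy as np

def make_splits(train_x, train_y, val_x=None, val_y=None,
                val_proportion=None, seed=0):
    """Return ((train_x, train_y), validation_or_None)."""
    if val_x is not None:
        # dataset ships its own validation split: use it directly
        return (train_x, train_y), (val_x, val_y)
    if val_proportion is None:
        # default behavior: no validation set, no hyperparameter sweep
        return (train_x, train_y), None
    # carve a validation subset out of the train split
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(train_x))
    n_val = int(len(train_x) * val_proportion)
    val_idx, tr_idx = idx[:n_val], idx[n_val:]
    return ((train_x[tr_idx], train_y[tr_idx]),
            (train_x[val_idx], train_y[val_idx]))

x = np.arange(100).reshape(100, 1)
y = np.arange(100) % 2
(tr_x, tr_y), val = make_splits(x, y, val_proportion=0.2)
print(len(tr_x), len(val[0]))  # 80 train / 20 validation samples
```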

teasgen commented 10 months ago

> Hi @teasgen, working fine for me! The only thing that would be nice to keep is the default behavior, i.e. not specifying a validation dataset. Currently, it fails with an error message if the validation set or validation proportion is not given. With this commit: 396f807, I could make it work fine again, but I might have missed something. Could you please have a look/confirm?
>
> Thanks!

Hi! Your commit looks good, I suppose it's alright now. Could you please release a new version to PyPI as soon as the PR is merged?

mehdidc commented 10 months ago

Great, thanks @teasgen! Yes, sure, will release a new version on PyPI!

mehdidc commented 10 months ago

Merging, will add the other commit right after.

mehdidc commented 10 months ago

@teasgen @Danil328 available now in 1.6.0: `pip install clip_benchmark==1.6.0`