LAION-AI / CLIP_benchmark

CLIP-like model evaluation
MIT License

Improved linear evaluation that achieves better results #107

Closed teasgen closed 12 months ago

teasgen commented 1 year ago

In the updated linear evaluation, the dataset is divided into three parts: train, validation, and test. If the dataset does not already have a validation split, I divide the train part into two sections according to a specified proportion. This gives fairer results. I have also added regularization with the OpenAI hyperparameter sweep (https://arxiv.org/pdf/2103.00020.pdf, section A.3). The results are now closer to the OpenAI metrics for CLIP models (same paper, table 10).
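The splitting behavior described above could be sketched as follows. This is a minimal illustration, not the PR's actual code; the function name and `val_proportion` argument are assumptions, and `torch.utils.data.random_split` is used for the split.

```python
# Sketch: if a dataset has no validation split, carve one out of the
# training set according to a given proportion (illustrative only).
import torch
from torch.utils.data import random_split

def split_train_val(train_dataset, val_proportion=0.2, seed=0):
    """Split a training dataset into (train, val) subsets."""
    n_val = int(len(train_dataset) * val_proportion)
    n_train = len(train_dataset) - n_val
    # Fixed generator so the split is reproducible across runs
    generator = torch.Generator().manual_seed(seed)
    return random_split(train_dataset, [n_train, n_val], generator=generator)
```

Holding the split fixed with a seeded generator matters here: the hyperparameter sweep selects on the validation subset, so a shifting split would make the selected regularization strength non-reproducible.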

e.g. ViT-L-14 openai model:

| Dataset | Current | New | OpenAI | Diff decrease |
|---|---|---|---|---|
| DTD | 80.1 | 82.1 | 82.1 | -2.0 |
| Country211 | 38.7 | 42.1 | 42.9 | -1.4 |
| Food101 | 93.4 | 95.3 | 95.2 | -1.9 |
| Aircraft | 62.4 | 67.5 | 69.4 | -4.1 |
| Cifar100 | 84.8 | 87.3 | 87.5 | -2.5 |
| Cifar10 | 97.7 | 98.0 | 98.0 | -0.3 |

| Hyperparameter | Value |
|---|---|
| Batch size | 512 |
| Epochs | 20 |
| LR | 0.1 |
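The OpenAI-style regularization sweep (CLIP paper, A.3) amounts to training a linear probe on frozen features for each candidate L2 strength on a log-spaced grid and keeping the value with the best validation accuracy. The sketch below is an assumption-laden illustration, not the PR's code: it uses scikit-learn's `LogisticRegression` (where `C = 1/lambda`) instead of the SGD probe with the hyperparameters listed above, and the grid is illustrative.

```python
# Sketch of a log-spaced L2 regularization sweep for a linear probe
# on frozen features (illustrative; the PR's implementation may differ).
import numpy as np
from sklearn.linear_model import LogisticRegression

def sweep_l2(train_feats, train_labels, val_feats, val_labels,
             lambdas=np.logspace(-6, 6, 7)):
    """Return the L2 strength with the best validation accuracy."""
    best_lam, best_acc = None, -1.0
    for lam in lambdas:
        # scikit-learn parameterizes regularization as C = 1 / lambda
        clf = LogisticRegression(C=1.0 / lam, max_iter=1000)
        clf.fit(train_feats, train_labels)
        acc = clf.score(val_feats, val_labels)
        if acc > best_acc:
            best_lam, best_acc = lam, acc
    return best_lam, best_acc
```

Selecting lambda on the held-out validation subset rather than the test set is what makes the reported numbers "more fair": the test split is touched only once, with the chosen hyperparameters.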
Danil328 commented 1 year ago

Approve, please

mehdidc commented 1 year ago

Sorry for the delay, and thank you very much for the PR @teasgen. I will have a look right after fixing #109

ankitkv commented 1 year ago

Hi @teasgen ! Do you happen to know the best setting to use your PR for linear probe on ImageNet?

teasgen commented 1 year ago

> Hi @teasgen ! Do you happen to know the best setting to use your PR for linear probe on ImageNet?

Hi, unfortunately I haven't tested my PR on ImageNet. But you can efficiently find the best hyperparameters using the CLI arguments. However, I used the same setting for all datasets, so you could try that first.

mehdidc commented 12 months ago

Hi @teasgen, working fine for me! The only thing that would be nice to keep is the default behavior, i.e. not specifying a validation dataset. Currently, it fails with an error message if neither a validation set nor a validation proportion is given. With this commit: https://github.com/LAION-AI/CLIP_benchmark/commit/396f8073f6c84ca230e4ecaa6d16db3e90a71d1c, I could make it work again, but I might have missed something. Could you please have a look and confirm?

Thanks!

teasgen commented 12 months ago

> Hi @teasgen, working fine for me! the only thing that would be nice to keep is the default behavior, i.e. not specifying a validation dataset. Currently, it fails with an error message if validation set or validation proportion are not given. With this commit: 396f807, I could make it working fine again, but I might have missed something. Could you please have a look/confirm ?
>
> Thanks!

Hi! Your commit looks good, I suppose it's all right now. Could you please release a new version to PyPI as soon as the PR is merged?

mehdidc commented 12 months ago

Great, thanks @teasgen! Yes, sure, I will release a new version on PyPI!

mehdidc commented 12 months ago

Merging, will add the other commit right after.

mehdidc commented 12 months ago

@teasgen @Danil328 available now in 1.6.0: `pip install clip_benchmark==1.6.0`