LAION-AI / CLIP_benchmark

CLIP-like model evaluation
MIT License

Improved linear evaluation that achieves better results #107

Closed teasgen closed 10 months ago

teasgen commented 1 year ago

In the updated linear evaluation, the dataset is divided into three parts: train, validation, and test. If a dataset does not already have a validation split, the train split is divided into two sections according to a specified proportion, which gives fairer results. I've also added regularization with the OpenAI hyperparameter sweep (https://arxiv.org/pdf/2103.00020.pdf, appendix A.3). The results are now much closer to the OpenAI metrics for CLIP models (same paper, Table 10).
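The validation-based regularization sweep described above can be sketched roughly as follows. This is a minimal illustration in the spirit of the CLIP paper's appendix A.3 (sweep the L2 strength, pick the best on the validation split), not the actual CLIP_benchmark implementation; the feature matrices are synthetic stand-ins for precomputed CLIP image features, and `sweep_l2` is a hypothetical helper name.

```python
# Minimal sketch: pick the L2 regularization strength for a linear
# probe by validation accuracy, in the style of CLIP appendix A.3.
# Synthetic features stand in for precomputed CLIP embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

def sweep_l2(train_x, train_y, val_x, val_y, lambdas):
    """Fit one logistic-regression probe per L2 strength and
    return the lambda with the best validation accuracy."""
    best_lam, best_acc = None, -1.0
    for lam in lambdas:
        # sklearn parameterizes by C, the inverse regularization strength
        clf = LogisticRegression(C=1.0 / lam, max_iter=1000)
        clf.fit(train_x, train_y)
        acc = clf.score(val_x, val_y)
        if acc > best_acc:
            best_lam, best_acc = lam, acc
    return best_lam, best_acc

rng = np.random.default_rng(0)
x = rng.normal(size=(300, 16))
y = (x[:, 0] + 0.1 * rng.normal(size=300) > 0).astype(int)
train_x, train_y = x[:200], y[:200]
val_x, val_y = x[200:], y[200:]

lam, acc = sweep_l2(train_x, train_y, val_x, val_y, np.logspace(-4, 2, 7))
print(f"best lambda: {lam}, val accuracy: {acc:.3f}")
```

After the sweep, the probe selected this way would typically be retrained on the combined train+validation data and scored once on the held-out test split.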

e.g. ViT-L-14 openai model:

| Dataset    | Current | New  | OpenAI | Diff decrease |
|------------|---------|------|--------|---------------|
| DTD        | 80.1    | 82.1 | 82.1   | -2.0          |
| Country211 | 38.7    | 42.1 | 42.9   | -1.4          |
| Food101    | 93.4    | 95.3 | 95.2   | -1.9          |
| Aircraft   | 62.4    | 67.5 | 69.4   | -4.1          |
| Cifar100   | 84.8    | 87.3 | 87.5   | -2.5          |
| Cifar10    | 97.7    | 98.0 | 98.0   | -0.3          |
| Hyperparameter | Value |
|----------------|-------|
| Batch size     | 512   |
| Epochs         | 20    |
| LR             | 0.1   |
Danil328 commented 11 months ago

Approve, please

mehdidc commented 11 months ago

Sorry for the delay and thank you very much for the PR @teasgen. I will have a look right after fixing #109

ankitkv commented 10 months ago

Hi @teasgen ! Do you happen to know the best setting to use your PR for linear probe on ImageNet?

teasgen commented 10 months ago

> Hi @teasgen ! Do you happen to know the best setting to use your PR for linear probe on ImageNet?

Hi, unfortunately I haven't tested my PR on ImageNet, but you can efficiently find the best hyperparameters using the CLI arguments. However, I used the same settings for all datasets, so you could try those first.
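For reference, a linear-probe run driven through the CLI might look like the sketch below. Flag names are taken from the project's README conventions at the time and may differ between versions; `--val_proportion` in particular is assumed from this PR. Check `clip_benchmark eval --help` for the exact options.

```shell
# Hedged sketch of a linear-probe evaluation via the CLI;
# verify flag names against `clip_benchmark eval --help`.
clip_benchmark eval --dataset=cifar10 --task=linear_probe \
    --model=ViT-L-14 --pretrained=openai \
    --batch_size=512 --fewshot_lr 0.1 --fewshot_epochs 20 \
    --val_proportion 0.2 --output=result.json
```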

mehdidc commented 10 months ago

Hi @teasgen, working fine for me! The only thing that would be nice to keep is the default behavior, i.e. not specifying a validation dataset. Currently, it fails with an error message if the validation set or validation proportion is not given. With this commit: https://github.com/LAION-AI/CLIP_benchmark/commit/396f8073f6c84ca230e4ecaa6d16db3e90a71d1c, I could make it work fine again, but I might have missed something. Could you please have a look/confirm?

Thanks!
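The fallback behavior discussed here can be sketched as follows: use an explicit validation split when one exists, carve one out of the train split when a proportion is given, and otherwise skip the sweep entirely. The function and argument names are illustrative, not CLIP_benchmark's actual API.

```python
# Illustrative sketch of the default-behavior fix: only create a
# validation split when one is provided or a proportion is given.
import numpy as np

def make_splits(train_x, train_y, val_x=None, val_y=None,
                val_proportion=None, seed=0):
    """Return ((train_x, train_y), validation_or_None)."""
    if val_x is not None:
        # dataset ships its own validation split: use it directly
        return (train_x, train_y), (val_x, val_y)
    if val_proportion is None:
        # default behavior: no validation set, no hyperparameter sweep
        return (train_x, train_y), None
    # carve a validation subset out of the train split
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(train_x))
    n_val = int(len(train_x) * val_proportion)
    val_idx, tr_idx = idx[:n_val], idx[n_val:]
    return ((train_x[tr_idx], train_y[tr_idx]),
            (train_x[val_idx], train_y[val_idx]))

x = np.arange(100).reshape(100, 1)
y = np.arange(100) % 2
(tr_x, tr_y), val = make_splits(x, y, val_proportion=0.2)
print(len(tr_x), len(val[0]))  # 80 train / 20 validation samples
```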

teasgen commented 10 months ago

> Hi @teasgen, working fine for me! The only thing that would be nice to keep is the default behavior, i.e. not specifying a validation dataset. Currently, it fails with an error message if the validation set or validation proportion is not given. With this commit: 396f807, I could make it work fine again, but I might have missed something. Could you please have a look/confirm?
>
> Thanks!

Hi! Your commit looks good, I suppose it's alright now. Could you please release a new version to PyPI as soon as the PR is merged?

mehdidc commented 10 months ago

Great, thanks @teasgen! Yes, sure, will release a new version on PyPI!

mehdidc commented 10 months ago

Merging, will add the other commit right after.

mehdidc commented 10 months ago

@teasgen @Danil328 available now in 1.6.0: `pip install clip_benchmark==1.6.0`