facebookresearch / dino

PyTorch code for training Vision Transformers with the self-supervised learning method DINO
Apache License 2.0

Speeding-up `eval_linear.py` skipping augmentation? #141

Closed mgpadalkar closed 3 years ago

mgpadalkar commented 3 years ago

Running 100 epochs of `eval_linear.py` takes quite a long time because the frozen features are re-extracted in every epoch. I understand that this is needed to augment the data for training the fc layer.

Without augmentation (similar to what is done in `eval_knn.py`), one would compromise a bit on accuracy but save a lot of time and resources. This can provide a quick estimate of the accuracy when running several experiments in search of better features. Once an approach for feature extraction is finalized, augmented data can be used to get the real accuracy.

If the above sounds reasonable, it would be nice to have such an option among the command-line arguments.
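
In code, the idea would roughly look like the sketch below (a minimal illustration, not the actual patch; the helper names `extract_features`/`train_linear` and the hyper-parameters are just placeholders):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


@torch.no_grad()
def extract_features(backbone, loader, device="cuda"):
    """Run the frozen backbone exactly once and cache the outputs."""
    backbone.eval()
    feats, labels = [], []
    for images, targets in loader:
        feats.append(backbone(images.to(device)).cpu())
        labels.append(targets)
    return torch.cat(feats), torch.cat(labels)


def train_linear(train_feats, train_labels, num_classes, epochs=100, lr=0.03):
    """Every epoch is now a pass over cached feature vectors, not images."""
    linear = nn.Linear(train_feats.shape[1], num_classes)
    optimizer = torch.optim.SGD(linear.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    loader = DataLoader(TensorDataset(train_feats, train_labels),
                        batch_size=1024, shuffle=True)
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            criterion(linear(x), y).backward()
            optimizer.step()
    return linear
```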

I tried this approach with `--arch resnet50 --lr 0.03 --epochs 300` and was able to get the following results in about 90 minutes on a single-GPU machine.

```
* Acc@1 74.514 Acc@5 92.290 loss 1.027
Accuracy at epoch 299 of the network on the 50000 test images: 74.5%
Max accuracy so far: 74.56%
Training of the supervised linear classifier on frozen features completed.
Top-1 test accuracy: 74.6
```

As we can see, the accuracy is a bit lower than the 75.3% reported here, but a lot of time is saved.

woctezuma commented 3 years ago

I think it makes sense in a fork or a branch, but not in the official implementation which goes with the paper. It could be confusing.

That being said, that is a smart change for the rest of us without a cluster of GPUs. :)

mathildecaron31 commented 3 years ago

Hi @mgpadalkar, that is a very good point, and it is actually very similar to what we propose with the logistic regression evaluation on frozen pre-computed features with cyanure. You can check Table 10 of our paper and the following issue for guidance on how to use cyanure: https://github.com/facebookresearch/dino/issues/121
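
For anyone who wants to try this quickly on pre-computed features, a minimal sketch could look as follows (scikit-learn's `LogisticRegression` is used here only as a stand-in for cyanure, and the random arrays are placeholders for cached DINO features):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholders standing in for features extracted once with the frozen backbone.
rng = np.random.default_rng(0)
train_feats = rng.standard_normal((5000, 2048)).astype(np.float32)
train_labels = rng.integers(0, 10, 5000)
test_feats = rng.standard_normal((1000, 2048)).astype(np.float32)
test_labels = rng.integers(0, 10, 1000)

def l2_normalize(x):
    # Rescale each feature vector to unit length before fitting the classifier.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

clf = LogisticRegression(max_iter=1000)  # regularization strength left at the default
clf.fit(l2_normalize(train_feats), train_labels)
print("top-1 accuracy:", clf.score(l2_normalize(test_feats), test_labels))
```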

mgpadalkar commented 3 years ago

Hi @mathildecaron31, thanks for pointing it out. Somehow I had missed this part in Appendix A of the paper :see_no_evil:.

By the way, in `eval_knn.py`, and also as you mention in https://github.com/facebookresearch/dino/issues/121#issuecomment-921984790, L2-normalized features are used, while this doesn't seem to be the case in `eval_linear.py`. But the logistic regression evaluation is definitely useful :+1:.
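
For reference, the normalization I mean is roughly the following step applied to the extracted features:

```python
import torch
import torch.nn as nn

features = torch.randn(8, 2048)  # placeholder batch of frozen backbone features
# Each feature vector is rescaled to unit L2 norm before being used.
features = nn.functional.normalize(features, dim=1, p=2)
```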

> I think it makes sense in a fork or a branch, but not in the official implementation which goes with the paper. It could be confusing. That being said, that is a smart change for the rest of us without a cluster of GPUs. :)

Hi @woctezuma, I have it here in case someone needs it. One just needs to pass `--no_aug` to disable the data augmentation.
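
The change essentially swaps the training augmentation for the deterministic validation-style transform when the flag is set; a rough sketch of how such an option could be wired in (the actual fork may differ in details):

```python
import argparse
from torchvision import transforms

parser = argparse.ArgumentParser("Linear evaluation")
parser.add_argument("--no_aug", action="store_true",
                    help="Skip random crop/flip so the frozen features only need to be computed once.")
args = parser.parse_args()

normalize = transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
if args.no_aug:
    # Deterministic, eval_knn.py-style pipeline.
    train_transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize,
    ])
else:
    # Default eval_linear.py-style training augmentation.
    train_transform = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize,
    ])
```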

rbareja25 commented 5 months ago

@mgpadalkar I was wondering if the major performance improvement is due to excluding the augmentation or primarily due to extracting the features first and then doing the linear classification?