Closed: mgpadalkar closed this issue 3 years ago.
I think it makes sense in a fork or a branch, but not in the official implementation which goes with the paper. It could be confusing.
That being said, that is a smart change for the rest of us without a cluster of GPUs. :)
Hi @mgpadalkar, that is a very good point, and it is actually very similar to what we propose with the logistic regression evaluation on frozen pre-computed features with cyanure. You can check Table 10 of our paper and the following issue for guidance on how to use cyanure: https://github.com/facebookresearch/dino/issues/121
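For readers who want the flavor of that protocol without cyanure: logistic regression on frozen pre-computed features boils down to fitting an ordinary linear classifier on a cached feature matrix. The sketch below uses scikit-learn's `LogisticRegression` as a stand-in (the toy data, regularization strength, and solver settings are my assumptions, not values from the paper or from cyanure):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for frozen, pre-computed backbone features (toy data, not DINO output).
features = rng.normal(size=(100, 16))
# Toy labels that are linearly predictable from the features.
labels = (features[:, 0] > 0).astype(int)

# Fit a regularized logistic regression on the cached features,
# analogous to the frozen-feature evaluation protocol.
clf = LogisticRegression(C=1.0, max_iter=1000).fit(features, labels)
acc = clf.score(features, labels)
```

With real DINO features one would replace the toy arrays with features extracted once from the frozen backbone, and sweep the regularization strength as the paper's protocol does.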
Hi @mathildecaron31, thanks for pointing to it. Somehow I had missed this part in Appendix A of the paper :see_no_evil:.
By the way, in `eval_knn.py`, and also as you mention in https://github.com/facebookresearch/dino/issues/121#issuecomment-921984790, the L2-normalized features are used, while this doesn't seem to be the case in `eval_linear.py`. In any case, the logistic regression evaluation is certainly useful :+1:.
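For concreteness, the L2 normalization used in `eval_knn.py` just rescales each feature vector to unit norm. A minimal NumPy sketch (the function name and epsilon guard are mine, not from the repo):

```python
import numpy as np

def l2_normalize(features, eps=1e-12):
    # Rescale each row (one feature vector) to unit L2 norm;
    # eps guards against division by zero for all-zero rows.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)

feats = np.array([[3.0, 4.0], [0.0, 2.0]])
normed = l2_normalize(feats)  # every row now has unit length
```

Whether to apply this before the linear head is part of the protocol choice being discussed: adding it to `eval_linear.py` would change its evaluation setup.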
> I think it makes sense in a fork or a branch, but not in the official implementation which goes with the paper. It could be confusing. That being said, that is a smart change for the rest of us without a cluster of GPUs. :)
Hi @woctezuma, I have it here if someone needs it. You just need to pass `--no_aug` for no data augmentation.
@mgpadalkar I was wondering whether the major performance improvement is due to excluding augmentation, or primarily due to extracting the features first and then doing linear classification?
Running 100 epochs of `eval_linear.py` takes quite a long time, as the frozen features are extracted in every epoch. I understand that this is needed to augment the data for training the `fc` layer.

Without augmentation (similar to what is done in `eval_knn.py`), one would compromise a bit on accuracy but save a lot of time and resources. This can provide a quick estimate of the accuracy when running several experiments to obtain better features. Once an approach for feature extraction is finalized, augmented data can be used to get the real accuracy.

If the above sounds reasonable, it would be nice to have such an option when passing the arguments.
I tried this approach with `--arch resnet50 --lr 0.03 --epochs 300` and was able to get the following results in about 90 min on a single-GPU machine. As we can see, the accuracy is a bit lower (than the 75.3% that is reported here), but a lot of time is saved.