JunYeopLee / fast-autoaugment-efficientnet-pytorch

A PyTorch implementation of Fast AutoAugment and EfficientNet

Problems in understanding this paper #2

Closed: wang3702 closed this issue 4 years ago

wang3702 commented 4 years ago

First, thanks a lot for your reimplementation of Fast AutoAugment. It really helped me a lot. However, after carefully checking your code, I need to point out a problem in your understanding of this paper. The stratified shuffling means the search part should use K models instead of one model. That is, for each split of the k-fold, you train a model, find its corresponding best policies, and then combine the top-N policies from the 5 folds. However, in your implementation, you only have one model. That's not good. Still, you achieved good performance, which in turn proves that Fast AutoAugment is a useful method.

JunYeopLee commented 4 years ago

Hi wang, thanks for your careful review.

However, to my knowledge, it is correct that there is only one trained model in the Fast AutoAugment algorithm. The key of the algorithm is to use "inference-time augmentation" to quickly find augmentation policies without training multiple models. That is what differs from the original AutoAugment.
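For reference, here is a minimal sketch of what I mean by "inference-time augmentation", assuming an already-trained `model`, a list of candidate policies, and an `apply_policy` transform (all hypothetical names, not this repository's actual API): each candidate policy is scored by the trained model's loss on the augmented validation data, so no per-policy training is needed.

```python
import torch

def score_policy(model, policy, val_images, val_labels, apply_policy):
    """Loss of the already-trained model on validation data augmented by `policy`."""
    model.eval()
    criterion = torch.nn.CrossEntropyLoss()
    with torch.no_grad():
        augmented = apply_policy(val_images, policy)  # transform only, no training step
        loss = criterion(model(augmented), val_labels)
    return loss.item()

def top_n_policies(model, candidates, val_images, val_labels, apply_policy, n=10):
    """Rank candidate policies by validation loss and keep the best n."""
    scored = sorted(
        candidates,
        key=lambda p: score_policy(model, p, val_images, val_labels, apply_policy),
    )
    return scored[:n]
```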

I may have misunderstood. If my explanation is wrong, please point it out.

Thank you.

wang3702 commented 4 years ago

Hi Lee. Thanks for your quick reply! Here we do not need a child model to explore and exploit. However, we do need to collect more policies in this situation. Therefore, we use stratified shuffling to get the k-fold cross-validation datasets, so that we can finally collect K different top-N policy sets, one from each model. Please note that in the loop there is not one model but K models. The reason I think you are incorrect is that the stratified split function in sklearn gives you K train/validation pairs if you specify K=5. However, in your implementation the stratified split is not properly used as suggested. Please check this to help you better understand: https://github.com/junkwhinger/fastautoaugment_jsh/blob/master/search_fastautoaugment.py
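For clarity, here is a rough sketch of the K-fold search loop described above. The callables `train_child_model`, `search_top_n_policies`, and `train_final_model` are placeholders, not this repository's actual functions; the point is only that the stratified splitter yields K train/validation pairs and therefore K child models.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

def fast_autoaugment_search(images, labels, train_child_model,
                            search_top_n_policies, train_final_model,
                            K=5, top_n=10):
    """Sketch: one child model per stratified split, top-N policies per fold, then merge."""
    images, labels = np.asarray(images), np.asarray(labels)
    splitter = StratifiedShuffleSplit(n_splits=K, test_size=0.2, random_state=0)
    collected_policies = []
    # K train/validation pairs -> K child models, not a single shared model
    for train_idx, valid_idx in splitter.split(images, labels):
        child = train_child_model(images[train_idx], labels[train_idx])
        collected_policies.extend(
            search_top_n_policies(child, images[valid_idx], labels[valid_idx], n=top_n)
        )
    # the merged K * top_n policies augment the full training set for the final model
    return train_final_model(images, labels, collected_policies)
```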

wang3702 commented 4 years ago

I checked your code again and found that you actually did train K=5 different child models and then used the policies from them to train the final model. You are correct.