barisozmen / deepaugment

Discover augmentation strategies tailored for your dataset
MIT License

'Fast AutoAugment' by my team, KakaoBrain #25

Open ildoonet opened 5 years ago

ildoonet commented 5 years ago

A 30x-250x more efficient method to find augmentation policies automatically, relative to AutoAugment by Google.

arXiv: https://arxiv.org/abs/1905.00397
Code: https://github.com/KakaoBrain/fast-autoaugment

We don't train child networks like AutoAugment or deepaugment do, and that is the key reason for the speed. But I really appreciate your work, and I hope we can influence each other in a good way. I also want to make my repo as easy to use as yours.

lichuanx commented 5 years ago

We don't train child networks like AutoAugment or deepaugment do, and that is the key reason for the speed.

Hey, nice and interesting work. A question: if you don't search for a policy with a child network and then transfer it to a big network, won't the time spent training big networks such as Inception-ResNet-v2 be very long? How do you tackle this? I have read your paper, but I can't figure out the answer. Thank you.

ildoonet commented 5 years ago
  1. We do search policies using small networks. We can use a proxy network (a small network) to search policies and then transfer the policies to the big net. In our setting, the ImageNet experiment was conducted using a small subset of the dataset and ResNet-50 as the proxy network.

  2. But we don't evaluate a policy by training child networks. We train a network just once, and our algorithm evaluates many policies without any further training.
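
To make the idea concrete, here is a minimal, self-contained toy sketch of this kind of single-training-run policy evaluation. This is not the Fast AutoAugment code; the data, the nearest-centroid stand-in for the trained network, and the toy "policies" are all made up for illustration.

```python
# Toy illustration: train a model ONCE, then score candidate augmentation
# policies by the already-trained model's accuracy on an augmented held-out
# split. No child networks, no retraining per policy.
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class data: class 0 around (-1, -1), class 1 around (+1, +1).
def make_data(n):
    y = rng.integers(0, 2, size=n)
    x = rng.normal(0, 0.5, size=(n, 2)) + np.where(y[:, None] == 1, 1.0, -1.0)
    return x, y

x_train, y_train = make_data(400)       # D_M: used for the single training run
x_held, y_held = make_data(200)         # D_A: used only to score policies

# "Train once": a nearest-centroid classifier stands in for the network.
centroids = np.stack([x_train[y_train == c].mean(axis=0) for c in (0, 1)])

def accuracy(x, y):
    pred = np.argmin(((x[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    return (pred == y).mean()

# Candidate "policies": simple transforms standing in for real augmentations
# (rotate, shear, color ops, ...).
policies = {
    "identity":    lambda x: x,
    "small_noise": lambda x: x + rng.normal(0, 0.1, x.shape),
    "large_noise": lambda x: x + rng.normal(0, 2.0, x.shape),
    "flip_sign":   lambda x: -x,        # destroys the class structure
    "small_shift": lambda x: x + 0.2,
}

# Score every policy with the already-trained model: no further training.
scores = {name: accuracy(policy(x_held), y_held)
          for name, policy in policies.items()}

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} accuracy on augmented D_A = {score:.3f}")
# Policies that keep accuracy high (identity, small_noise, small_shift) are the
# ones whose augmented data still "looks like" the training distribution.
```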

lichuanx commented 5 years ago
  1. We do search policies using small networks. We can use a proxy network (a small network) to search policies and then transfer the policies to the big net. In our setting, the ImageNet experiment was conducted using a small subset of the dataset and ResNet-50 as the proxy network.
  2. But we don't evaluate a policy by training child networks. We train a network just once, and our algorithm evaluates many policies without any further training.

Oh, thank you for your reply. That's a very interesting idea. I'm trying to figure out the part of your paper that says "minimize the distance between the density of D_M and the density of D_A". Augmentation is meant to let the train set match the distribution of the val/test set, so how does searching for a policy that changes the val set to match the train set work? As far as I can tell, it acts as some sort of regularization that makes the augmented data represent the "strong features" already learned by your theta. Still a little bit confused.
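
For reference, the quoted objective can be paraphrased roughly as follows (this is a restatement in the paper's spirit, not an exact transcription; theta_{D_M} denotes the model trained once on D_M, and R denotes its performance, e.g. accuracy, on the augmented split):

```latex
% Rough paraphrase: pick the policy whose augmented split the trained model
% still scores well on, i.e. whose density best matches the training split.
\mathcal{T}_{*} \;=\; \operatorname*{arg\,max}_{\mathcal{T}}\;
  R\bigl(\theta_{\mathcal{D}_M} \,\bigm|\, \mathcal{T}(\mathcal{D}_A)\bigr)
```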