tensorflow / tpu

Reference models and tools for Cloud TPUs.
https://cloud.google.com/tpu/
Apache License 2.0

Questions can't be found in papers about randaugment. #637

Open minjieharmo opened 4 years ago

minjieharmo commented 4 years ago

Hello, I have three questions that the papers don't answer.

1. How should the number of pixels for TranslateX/Y be chosen? On CIFAR-10 it is 10 pixels (10/32), but on ImageNet it is 150 pixels (150/331). For example, if the image size is (224, 224), how many pixels should be used?

2. Does RandAugment use the same range of magnitudes as AutoAugment, or three times that range, as in the picture below? [image]

3. M ranges from 0 to 30. For some operations (Shear, Translate, Rotate, ...) M=0 gives the original image, but for others (Brightness, Color, Contrast) M=15 gives the original image. Is that intentional?

rwightman commented 4 years ago

@minjieharmo I've had some similar questions about RandAugment in #614

I have a version of AA and RA working in PyTorch (https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/auto_augment.py). I've made my RA implementation match AA with an M range of 0-10, assuming that was the intended maximum for the implementation in this repository.

For the translation, I switched my implementation to use a relative translate, with a percentage of the image size as the constant instead of 150 pixels. I've been using .45 (150/331) and it seems to work well at 224 and with EfficientNet-B2 and B3 (260 and 300). I've noted that the AugMix paper and implementation use .333 as the percentage for their translate. I'm currently working on bringing AugMix into the same file as my AA and RA implementation.
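As a rough illustration of the relative translate described above, the sketch below scales the maximum shift with the image size instead of using a fixed pixel count. The function names, the linear mapping from magnitude to pixels, and the defaults are assumptions made for this example, not code taken from either repository.

```python
def max_translate_pixels(image_size: int, translate_pct: float = 0.45) -> float:
    """Largest shift in pixels, expressed as a fraction of the image size.

    translate_pct=0.45 mirrors the 150/331 ratio mentioned above;
    a value of 1/3 would correspond to the AugMix-style setting.
    """
    return translate_pct * image_size


def translate_pixels(magnitude: float, image_size: int,
                     translate_pct: float = 0.45,
                     max_level: float = 10.0) -> float:
    """Shift in pixels for a given magnitude (assumed linear in magnitude)."""
    return (magnitude / max_level) * max_translate_pixels(image_size, translate_pct)


if __name__ == "__main__":
    # Compare the shift at full magnitude for a few common input sizes.
    for size in (224, 260, 300, 331):
        print(size, translate_pixels(10, size))
```

With this form the same constant carries over when the input resolution changes, which is the point of replacing the hard-coded 150.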

I also found it odd that the direction of 'increasing' severity for the magnitude differs across operations. This is the most troubling for me, as it makes it fuzzy as to what the difference between magnitudes is exactly. This didn't matter in AA as the magnitudes were searched per op, but seems undesirable for RandAugment where you'd expect M0 to be no (or minimal) change and M10 to be maximum across all ops. It's worth noting that the Google team who did AugMix made those operations more consistent in terms of tying the magnitude to increasing severity.
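One way to make the enhancement ops (Brightness, Color, Contrast) follow the "M0 is no change" convention is to map the magnitude to a distance from the identity factor and pick the direction at random. This is only a sketch of that idea: the function name, the `max_delta=0.9` bound, and the random sign are assumptions for illustration, and it is not presented as what this repository or AugMix does. It relies on the convention, used by PIL's `ImageEnhance` classes, that a factor of 1.0 returns the original image.

```python
import random


def enhance_factor(magnitude: float, max_level: float = 10.0,
                   max_delta: float = 0.9, rng=random) -> float:
    """Enhancement factor whose distance from 1.0 grows with magnitude.

    magnitude=0 gives exactly 1.0 (the unchanged image); magnitude=max_level
    gives 1.0 - max_delta or 1.0 + max_delta, chosen at random.
    """
    delta = (magnitude / max_level) * max_delta
    sign = rng.choice((-1.0, 1.0))  # darken/desaturate or brighten/saturate
    return 1.0 + sign * delta


if __name__ == "__main__":
    for m in (0, 5, 10):
        print(m, enhance_factor(m))
```

The resulting factor could then be passed to something like `ImageEnhance.Brightness(img).enhance(factor)`, so that severity increases with M for these ops in the same way it does for Shear, Translate, and Rotate.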

JoinWei-PKU commented 4 years ago

@BarretZoph I have similar questions as above. Could you please answer our questions? Thanks

BarretZoph commented 4 years ago

Hi everybody, thank you for the questions/comments.

  1. "How should the number of pixels for TranslateX/Y be chosen? On CIFAR-10 it is 10 pixels (10/32), but on ImageNet it is 150 pixels (150/331). For example, if the image size is (224, 224), how many pixels should be used?" We used a heuristic of setting the maximum translation (when the magnitude is 10) to image_size/3. We found this hyperparameter to be somewhat insensitive in the range from image_size/3 to image_size/2.

  2. "Does RandAugment use the same range of magnitudes as AutoAugment, or three times that range, as in the picture below?" RandAugment magnitudes still use the same scale in some sense. In AutoAugment the magnitudes can only be between 0 and 10, while in RandAugment we experiment with magnitudes up to around 30. The scaling of the magnitudes is the same in the two papers; RandAugment just experiments with higher magnitudes.

  3. "M ranges from 0 to 30. For some operations (Shear, Translate, Rotate, ...) M=0 gives the original image, but for others (Brightness, Color, Contrast) M=15 gives the original image. Is that intentional?" No, we actually had in mind making all operations "more extreme" as the magnitude increased, but for a few ops this was not the case (e.g. posterize). We have since fixed this and have not noticed a big difference in performance.

JoinWei-PKU commented 4 years ago

@BarretZoph Hi, thanks for your answer! I don't fully understand answer 2. In the TPU code, max_level=10, but the final value is scaled by magnitude/max_level, so a magnitude of 28 gives 28/10 x original_range. Or do you mean that you set max_level=30, so that the value is 28/30 x original_range? Which is correct: the TPU code with 28/10 x original_range, or 28/30 x original_range? Thanks.

JoinWei-PKU commented 4 years ago

Moreover, in the original AA paper, the Cutout operation is not among the search candidates, but in the RandAugment TPU code Cutout is added. Does this operation further improve performance?

BarretZoph commented 4 years ago

Hi, it would be "The TPU code with 28/10 x original_range." We found that adding Cutout to RandAugment improves performance in certain scenarios, so we added it!
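The arithmetic behind the two readings can be sketched as follows. The helper name and the 30-degree rotation range are assumptions chosen for illustration; the only point is that with max_level=10 a magnitude above 10 scales an op beyond its original range.

```python
def level_to_value(magnitude: float, max_value: float,
                   max_level: float = 10.0) -> float:
    """Linearly scale an op's range by magnitude / max_level.

    max_value is the op's range at magnitude == max_level (hypothetical
    here); magnitudes above max_level extrapolate past that range.
    """
    return (magnitude / max_level) * max_value


if __name__ == "__main__":
    original_range = 30.0  # assumed rotation range in degrees, for illustration

    # Reading confirmed in the reply above: 28/10 x original_range.
    print(level_to_value(28, original_range, max_level=10.0))  # about 84 degrees

    # The alternative reading, 28/30 x original_range, would stay inside the range.
    print(level_to_value(28, original_range, max_level=30.0))  # about 28 degrees
```

Under the confirmed reading, a magnitude of 28 is 2.8 times as strong as the strongest setting available to AutoAugment for the same op.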