Open nuaajeff opened 2 years ago
Hi @nuaajeff, thanks for your question. The difference in the policy may be caused by differences in the policy search configuration. Because the policy is learned in a data-driven way, there can be multiple versions of an optimal data augmentation strategy. I noticed that your learned policy also shows larger flipping probabilities on 0 and 1, which is a good sign.
For the end performance, I am running some experiments to double-check the results. Meanwhile, you may also tune the temperature, the number of operators, and the diversity parameters to better adapt the learned policy to the target dataset (see the sketch below).
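To make the role of those three knobs concrete, here is a minimal, hypothetical sketch of how temperature, number of operators, and a diversity perturbation could interact when turning learned per-operation scores into an applied augmentation. The function and parameter names (`sample_policy`, `temperature`, `k_ops`, `delta`) and the sampling scheme are my own illustration, not the repo's actual API.

```python
# Hypothetical sketch: how temperature, k_ops, and a diversity perturbation
# shape the augmentation actually applied at inference time.
import numpy as np

OPS = ["identity", "flip_lr", "rotate", "autocontrast", "shear_x", "translate_y"]

def sample_policy(logits, magnitudes, temperature=1.0, k_ops=2, delta=0.1, rng=None):
    """Pick k_ops operations and perturb their magnitudes.

    logits      -- learned per-operation scores (higher = more likely)
    magnitudes  -- learned per-operation magnitudes in [0, 1]
    temperature -- <1 sharpens the distribution, >1 flattens it
    k_ops       -- how many operations to chain per image
    delta       -- half-width of the uniform magnitude perturbation (diversity)
    """
    rng = rng or np.random.default_rng()
    # Temperature-scaled softmax over the operation logits.
    scaled = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # Sample k_ops distinct operations according to those probabilities.
    idx = rng.choice(len(OPS), size=k_ops, replace=False, p=probs)
    chosen = []
    for i in idx:
        # Perturb the learned magnitude for extra diversity, then clip to [0, 1].
        mag = float(np.clip(magnitudes[i] + rng.uniform(-delta, delta), 0.0, 1.0))
        chosen.append((OPS[i], mag))
    return chosen

# Example: a lower temperature concentrates sampling on the top operations,
# while a larger delta spreads out the applied magnitudes.
logits = [0.2, 1.5, 0.8, 1.1, 0.3, 0.4]
mags   = [0.0, 0.5, 0.3, 0.7, 0.2, 0.4]
print(sample_policy(logits, mags, temperature=0.5, k_ops=2, delta=0.05))
```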
Hi Tsz-Him, thanks for your work. I ran the code following the guidance, under the same environment, on reduced SVHN. However, the searched strategy is quite different from the one in the paper: the probability of the rotation operation is large on 4, 7, 9, etc., and the probability of autocontrast is also larger than in the original paper's strategy. The final accuracy achieved by the searched strategy is 90.86%, i.e. an error rate of 9.14%, which is nearly 1% higher than the reported 8.2%. Do you have any idea how to reproduce the original results correctly? Maybe some hyperparameters in this repo differ from the paper? Thanks for your help!