
ICLR Reproducibility Challenge 2019
https://reproducibility-challenge.github.io/iclr_2019/

Submission for Issue #120 #151

Open yashkant opened 5 years ago

yashkant commented 5 years ago

#120

Participant Information

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 8 Reviewer 1 comment: This work tries to reproduce Padam (partially adaptive momentum estimation) on CIFAR-100 and CIFAR-10 with VGG and (Wide-)ResNet. The experiments show that Padam achieves better results. The discussion of future study is appreciated. The writing of this report could be improved: for example, the figure of the network architecture on page 4 is not necessary, since it is not relevant to the main topic (the optimizer), and Figure 2 is ill-formatted. Confidence: 3

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 4 Reviewer 3 comment: The paper provides a good summary of the main contributions of the original work. The goal is a direct reproduction of the tables and graphs in the paper, using independent code. The proposed method is quite simple to implement, and the reproduction builds on a common implementation of Adam as a starting point. The authors of the reproducibility study do not report any difficulty in implementing Padam.
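(Editor's note: for readers unfamiliar with the method, below is a minimal NumPy sketch of the update being discussed, based on my reading of the original Padam paper. The function name `padam_step` is hypothetical; bias correction is omitted for brevity, and the `eps` placement and the defaults `p = 1/8`, `lr = 0.1` reflect the paper's CIFAR setup as I understand it, not a definitive implementation.)

```python
import numpy as np

def padam_step(theta, grad, m, v, v_hat, lr=0.1,
               beta1=0.9, beta2=0.999, p=0.125, eps=1e-8):
    """One Padam update for a single parameter array (illustrative sketch).

    Padam replaces AMSGrad's sqrt(v_hat) denominator with v_hat**p,
    p in (0, 1/2]: p = 1/2 recovers AMSGrad, and p -> 0 approaches
    SGD with momentum.
    """
    m = beta1 * m + (1.0 - beta1) * grad      # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad**2   # second-moment estimate
    v_hat = np.maximum(v_hat, v)              # AMSGrad-style running max
    # Partially adaptive step: v_hat**p in place of sqrt(v_hat).
    theta = theta - lr * m / (v_hat**p + eps)
    return theta, m, v, v_hat

# Toy usage: minimize f(x) = x^2 starting from x = 5 (gradient is 2x).
theta = np.array([5.0])
m, v, v_hat = np.zeros(1), np.zeros(1), np.zeros(1)
for _ in range(200):
    theta, m, v, v_hat = padam_step(theta, 2.0 * theta, m, v, v_hat)
```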

The reproducibility study considers the same hyperparameters as the original work for most of the results presented. There is a sub-section that considers the interaction between a key parameter of Padam ("p") and the learning rate, which is novel. Unfortunately, results for this were not finished, and the plots only show a small number of training iterations, not enough to draw useful conclusions. In the other experiments, the reproducibility study follows the original paper directly. I was surprised to see that the reproducibility study did not comment more on some differences. For example, the test error in Fig. 2 of the reproducibility study seems to have much more variance than the corresponding plots in Fig. 2 of the original work. Do you agree with this observation? Is there a reason for this? The same comment applies to Fig. 4 of the reproducibility study vs. Fig. 1 of the original work; in this latter case, the error also seems higher in the reproducibility study (not just the variance). Finally, comparing Table 2 in the reproducibility study and Table 1 in the original work, learning seems slower in the reproduced work, especially looking at the accuracy at epoch 50. Can you discuss these differences?

Overall, the reproducibility study does not make specific recommendations to the authors, except to note that further exploring how to set the hyperparameter "p" in Padam would be a good direction (see the note after this review). The paper would be improved by a more in-depth discussion of results throughout.

Confidence: 4
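(Editor's note on the "p"/learning-rate interaction raised above. In the notation of the original paper, the Padam update is

```latex
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad
v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2, \qquad
\hat{v}_t = \max(\hat{v}_{t-1},\, v_t),
\qquad
\theta_{t+1} = \theta_t - \alpha_t\, \frac{m_t}{\hat{v}_t^{\,p}}, \qquad p \in \bigl(0, \tfrac{1}{2}\bigr].
```

Since the entries of \(\hat{v}_t\) are typically much smaller than 1, lowering \(p\) pushes \(\hat{v}_t^{\,p}\) toward 1 and therefore shrinks the effective step \(\alpha_t / \hat{v}_t^{\,p}\). This scaling argument is a standard heuristic rather than a result from either paper, but it is consistent with the original work pairing a small \(p\) such as 1/8 with an SGD-style base rate (0.1) rather than an Adam-style one (0.001); the two cannot be tuned independently.)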

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 5 Reviewer 2 comment: The authors (of the reproducibility report) attempted to reproduce the CIFAR-10 and CIFAR-100 experiments in the Padam paper and plotted training and test curves. However, neither the training error nor the test error seems to match (or even come close to) the results in the paper. Despite this, the authors still concluded that this method is capable of merging the benefits of both Adam and SGD. This conclusion is a little surprising, since there is still a significant gap between the training error presented in the paper and the reproduced one. Confidence: 4