ICLR Reproducibility Challenge 2019
https://reproducibility-challenge.github.io/iclr_2019/

Submission for issue #87 #155

Open DaniFojo opened 5 years ago

DaniFojo commented 5 years ago

#87

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 7 Reviewer 1 comment: This reproducibility challenge attempts to reproduce the PA-GAN results. From the description in the report, it is clear that the participants understand the concepts of the paper they are trying to reproduce. The code for this reproduction was implemented almost from scratch, with some existing components reused for evaluation. The code was readable, but a script or directions to re-run the experiments would have been valuable.

The participants also contacted the authors on the OpenReview forum and clarified issues they came across while implementing the code; they note that the authors were responsive to their queries.

While the participants seem to have found a set of hyperparameters that partially works in reproduction, they do not detail the hyperparameter search that was done. It also seems that the results shown are for a single random seed; variability across seeds might provide additional insight into reproducibility.

Finally, the discussion and recommendations from this report are fairly straightforward and acceptable. Confidence: 4
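To make the seed-variability point concrete, here is a minimal sketch of the kind of re-run script the review asks for, reporting FID as a mean and standard deviation over several seeds. The `pagan_repro` module and its `train_pagan` / `compute_fid` functions are hypothetical stand-ins for the participants' actual entry points, not their real API:

```python
import statistics

import numpy as np
import torch

# Hypothetical entry points standing in for the participants' codebase;
# the real function names, arguments, and datasets will differ.
from pagan_repro import train_pagan, compute_fid

SEEDS = [0, 1, 2, 3, 4]
fids = []

for seed in SEEDS:
    # Fix every relevant RNG so each individual run is reproducible.
    torch.manual_seed(seed)
    np.random.seed(seed)

    generator = train_pagan(dataset="cifar10", seed=seed)
    fids.append(compute_fid(generator, dataset="cifar10"))

# A mean and standard deviation over seeds is more informative than a
# single-seed FID, which can vary substantially between GAN training runs.
mean_fid = statistics.mean(fids)
std_fid = statistics.stdev(fids)
print(f"FID over {len(SEEDS)} seeds: {mean_fid:.1f} +/- {std_fid:.1f}")
```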

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 6 Reviewer 3 comment:

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 5 Reviewer 2 comment: This paper conducts a re-implementation of the PA-GAN paper, using a new codebase developed by the report's authors.

Overall, the paper is a bit unpolished: there are still places where the authors have left all-caps reminders to edit sections, and various typos are scattered throughout the document. One of these places is the description of the PA-GAN algorithm, which makes it a bit hard to parse, though it is still possible to understand the essence of PA-GAN from the paper.

The authors produce their own code for the replication, which is nice, and they experiment on all 4 datasets that the original paper considered. The best thing about this report is that the authors contacted the original authors on several occasions to clarify aspects of the paper, and it turns out that the original authors' implementation differed from what was stated in the paper.

That said, I have a couple of problems with the paper. One is that the authors do not perform all of the experiments in the original paper, only the ones using NS-GAN. This may be due to limited computational resources, but that is not really explained; instead, the paper brushes it under the rug, saying that NS-GAN is used because it is used in the original paper. An important part of replication is stating to what extent you are replicating, given your computational budget, and I'd like to see an honest discussion of this in the paper. (The same goes for the fact that the authors did not experiment with any other hyperparameters or vary the random seed.)

I also take issue with the claim that the authors replicated the paper successfully. The FID number they report, 26.3, is the same as the NS-GAN result without PA reported in the original paper. Also, judging by the graphs, the PA approach seems to improve over the baseline on only 2 of the 4 datasets. I think this would be worth discussing in more detail.

Overall, I really appreciate this replication for providing open-source code and for probing into the details of the original authors' implementation. However, it does not seem like a full replication to me, only a limited one, and I'd really like this aspect to be discussed in more detail in the paper rather than being swept under the rug. Confidence: 4