ICLR Reproducibility Challenge 2019
https://reproducibility-challenge.github.io/iclr_2019/

Submission for issue #87 #155

Open DaniFojo opened 5 years ago

DaniFojo commented 5 years ago

#87

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 7 Reviewer 1 comment: This reproducibility challenge attempts to reproduce the PA-GAN results. From the description in the report, it is clear that the participants understand the concepts of the paper they are trying to reproduce. The code for this reproduction was implemented almost from scratch, with some existing components reused for evaluation. The code was readable, but a script or directions to re-run the experiments would have been valuable.

The participants also contacted the authors on the OpenReview forum and clarified issues they came across while implementing the code; they note that the authors were responsive to their queries.

While the participants seem to have found a set of hyperparameters that partially works in reproduction, they do not detail the hyperparameter search that was done. It also seems that the results shown are for a single random seed; variability across seeds might provide additional insight into reproducibility.

Finally, the discussion and recommendations from this report are fairly straightforward and acceptable. Confidence: 4
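To make the seed-variability point concrete, here is a minimal sketch of the kind of re-run script the review asks for, reporting FID as a mean and standard deviation over several seeds. The `pagan_repro` module and its `train_pagan` / `compute_fid` functions are hypothetical stand-ins for the participants' actual entry points, not their real API:

```python
import statistics

import numpy as np
import torch

# Hypothetical entry points standing in for the participants' codebase;
# the real function names, arguments, and datasets will differ.
from pagan_repro import train_pagan, compute_fid

SEEDS = [0, 1, 2, 3, 4]
fids = []

for seed in SEEDS:
    # Fix every relevant RNG so each individual run is reproducible.
    torch.manual_seed(seed)
    np.random.seed(seed)

    generator = train_pagan(dataset="cifar10", seed=seed)
    fids.append(compute_fid(generator, dataset="cifar10"))

# A mean and standard deviation over seeds is more informative than a
# single-seed FID, which can vary substantially between GAN training runs.
mean_fid = statistics.mean(fids)
std_fid = statistics.stdev(fids)
print(f"FID over {len(SEEDS)} seeds: {mean_fid:.1f} +/- {std_fid:.1f}")
```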

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 6 Reviewer 3 comment:

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 5 Reviewer 2 comment: This paper conducts a re-implementation of the PA-GAN paper, using a new codebase developed by the report's authors.

Overall, the paper is a bit unpolished: there are still places where the authors have left all-caps reminders to edit sections, and various typos are scattered throughout the document. One of these places is the description of the PA-GAN algorithm, which makes it a bit hard to parse, though it is still possible to understand the essence of PA-GAN from the paper.

The authors produce their own code for the replication, which is nice, and they experiment on all 4 datasets that the original paper considered. The best thing about this report is that the authors contacted the original authors on several occasions to clarify aspects of the paper, and it turns out that the original authors' implementation differed from what was stated in the paper.

That said, I have a couple of problems with the paper. One is that the authors do not perform all of the experiments in the original paper, only the ones using NS-GAN. This may be due to limited computational resources, but that is not really explained; instead, the paper brushes it under the rug, saying that NS-GAN is used because it is used in the original paper. An important part of replication is stating to what extent you are replicating, given your computational budget, and I'd like to see an honest discussion of this in the paper. (The same goes for the fact that the authors did not experiment with any other hyperparameters or vary the random seed.)

I also take issue with the claim that the authors replicated the paper successfully. The FID number they report, 26.3, is the same as the NS-GAN result without PA reported in the original paper. Also, judging by the graphs, the PA approach seems to improve over the baseline on only 2 of the 4 datasets. I think this would be worth discussing in more detail.

Overall, I really appreciate this replication for providing open-source code and for probing into the details of the original authors' implementation. However, it does not seem like a full replication to me, only a limited one, and I'd really like this aspect to be discussed in more detail in the paper rather than being swept under the rug. Confidence: 4