Closed Alfo5123 closed 5 years ago
@reproducibility-org complete
We made a modification to the final report submission. Thanks @reproducibility-org complete
Hi, please find below a review submitted by one of the reviewers:
Score: 9 Reviewer 1 comment : Problem statement The authors understand the motivation of the paper, which was also very clear and well presented in the original paper.
Code They implemented both the baseline (VAE) and the proposed method by hand, as per the parameters provided in the paper. No inline documentation is provided with the code, but their code is well organized and divided to modules, so it is easily readable.
Communication with original authors The authors of this report communicated with the authors of the original paper via the OpenReview platform. They specifically asked about the batch size that was used in the experiments, and the hardware/time taken for the experiments. The original authors responded to them, provided the missing hyperparameters, and confirmed that the time taken by the authors of this report is very similar to the time needed by the original authors. Finally, the authors shared their findings and suggestions with the original authors and confirmed getting similar results to what the paper presented.
Hyperparameter Search The authors were mainly focused on replicating the results in the original paper. Although they didn’t do hyperparameter tuning, they ran a comprehensive set of experiments which, in my opinion, is good enough.
Ablation Study The authors did evaluate whether the resulting sparse representations are interpretable features, as the original authors argue. This was done by interpolating the latent code and observing how the resulting images vary in terms of digit and orientation (for MNIST digits) or skin and hair color (for the CelebA dataset).
Discussion on results The authors provide a detailed discussion of the results. The main point, of course, being that they managed to replicate most of the results except for a minor part, where they trained their model for more epochs than what’s mentioned in the original paper and they noticed that the performance was better.
Recommendations for reproducibility The authors provided very useful insights on how to enhance the performance of the specified model. Specifically, they suggested using a ConvNet decoder instead of the FFN that the authors used. In their opinion, using a ConvNet will enhance the quality of the images and reduce the blurriness. The original authors agreed with almost all of their comments and future suggestions.
Overall organization and clarity The report is very clear and easy to read. The authors clearly put a great deal of effort into preparing it. The code is provided in Python notebooks, and the model weights were also provided in case someone wants to replicate the results of this report as well. It is important to note, though, that the original paper was very clear and organized as well, and contained most of the needed details, as the authors of this report mentioned.
Confidence : 5
Hi, please find below a review submitted by one of the reviewers:
Score: 9 Reviewer 3 comment : PROBLEM STATEMENT The problem is very clearly presented in a self-contained fashion.
CODE The code is developed from scratch. Routines to download and pre-process the data were also included.
COMMUNICATION WITH THE ORIGINAL AUTHORS There is no reference to communication with the authors. The report actually mentions some reproducibility details missing from the ICLR submission, so it seems no communication was established with the authors.
HYPERPARAMETER SEARCH There is no explicit mention of any attempt to find better hyperparameters.
ABLATION STUDIES No ablation studies were reported. They probably do not apply to the type of work being reproduced either.
DISCUSSION ON RESULTS There are some mentions regarding how difficult it is to reproduce the results due to the missing details and instability from hyperparameters. However, the discussion is fuzzy and is diluted in several other observations.
RECOMMENDATIONS FOR REPRODUCIBILITY It is mentioned that no batch size or weight initialization method was provided. Actually the report does not provide the weight initialization scheme used in this reproducibility effort, either. This is missing in its current form.
I also appreciated that the authors of the report included a computational cost in their analysis, details that were missing in the original ICLR submission.
DISCUSSION ON RESULTS The obtained results are clearly presented and match the conclusions of the original ICLR submission. The study also includes some additional experiments that improve performance.
As a minor issue, I do not understand why the plots in the first row of Figure 4 have a different style from the ones in the second and third rows. I assume it is not relevant, but the final submission should use the same style for all plots.
OVERALL ORGANIZATION AND CLARITY The paper is very well organized and clear. It was smooth and clear to read. The authors made a great effort in balancing the necessary background to understand the ICLR submission, but mostly focus on their reproducibility work.
As minor issues: Do not use capital letters in Section 1.2. Must remove full stops (.) in 1.1 “...The paper we aim to reproduce.” and 2.4.4 “...the same conceptual entity.”.
Confidence : 4
Hi, please find below a review submitted by one of the reviewers:
Score: 9 Reviewer 2 comment : Excellent implementation and documentation.
Problem statement Understood and explained problem very well.
Code Implemented from scratch (authors of original paper did not provide code). Clean code.
Communication with original authors Engaged with original authors and provided suggestions for improved reproducibility and ideas for improvements for future work.
Hyperparameter search Authors do not appear to have done an extra hyperparameter search.
Ablation study None provided.
Discussion on results Very good discussion on reproducibility of paper, along with suggestions for the original authors.
Overall organization and clarity Well structured and well written.
Confidence : 5
Meta Reviewer Decision: Accept