
ICLR Reproducibility Challenge 2019
https://reproducibility-challenge.github.io/iclr_2019/

VISUAL EXPLANATION BY INTERPRETATION: IMPROVING VISUAL FEEDBACK CAPABILITIES OF DEEP NEURAL NETWORKS #157

Open Krestone opened 5 years ago

Krestone commented 5 years ago

Issue Number: 101 https://github.com/reproducibility-challenge/iclr_2019/issues/101

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 5 Reviewer 1 comment: [Problem statement] The problem statement is clear.

[Code] The authors reproduced the code from scratch and make it available at https://github.com/Krestone/iclr_2019_code. It would be nice to also add this link to the report so the code base can be found easily. Documentation of the code should be added.

[Communication with original authors] The report does not mention communications with the original authors for testing reproducibility.

[Hyperparameter Search] Due diligence is shown in the hyperparameter sweep.

[Ablation Study] The report provides an ablation study similar to the original paper's. However, a lack of technical detail makes it hard to understand what from the original paper could be reproduced and what could not.

[Discussion on results] The report does not contain a detailed discussion of the state of reproducibility of the original paper.

[Recommendations for reproducibility] The authors don't explicitly provide recommendations to the original authors for improving reproducibility, but they do mention their difficulties in using the same framework as the original paper.

[Overall organization and clarity]

Score: 5 (reject) More details and further experiments could make this report good enough for acceptance. Confidence: 3

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 4 Reviewer 1 comment: [Problem Statement] The report states and understands the problem statement of the original paper but, in my opinion, does not convey this information very clearly. Without reading the original paper being analyzed, one would have a limited understanding of exactly what is being achieved and how. The authors do a good job of reproducing the MNIST ablations, and I think this study contributes a lot of value. However, many of the empirical comparisons in the original paper are not included in the report (other datasets and visualizations), though the authors do explicitly state that they will only focus on the MNIST ablations.

[Code] The authors reproduced the code from scratch. Documentation is not polished, but is sufficient to figure out how to use/read their code.

[Communication with original authors] N.A.

[Hyperparameter Search] The authors test an additional network and more sparsity parameters for their MNIST ablation.

[Ablation Study] The root of the empirical comparison is an ablation study. As mentioned, the authors extend their ablations to remove a larger number of filters.

[Discussion on results] Limited to a few terse sentences; however, there isn't much else to add. I found this sufficient given the small set of experiments performed.

[Recommendations for reproducibility] There don't appear to be any recommendations provided in the report.

[Organization and clarity] The organization of the paper is good. However, the overall clarity and exposition of the report can be vastly improved. Figure 1 is relatively incomprehensible, and there is little information conveyed in the text. This is quite a shame, since the authors provide valuable MNIST ablations and, with a little more work, the quality of the report would have been drastically improved. I think improving the clarity of the writing, the transitions between sections, and the code documentation, plus the addition of a single ablation on ImageNet-cats or Fashion144K, would qualify the paper for acceptance. Confidence: 4

reproducibility-org commented 5 years ago

Hi, please find below a review submitted by one of the reviewers:

Score: 5 Reviewer 3 comment: [Problem Statement]

The problem statement is clear and the reproducers appear to have understood the paper, although they could have better paraphrased its contributions paragraph (ref: end of page 2). I agree with the reproducers' decisions on where to focus the reproduction effort.

[Code]

The authors re-implemented from scratch, so this is an attempt at true reproduction rather than repeatability. The codebase is polyglot, with some processing done in MATLAB and some in Python/Jupyter notebooks. Documentation and commenting are slightly lacking; improving them would really speed up relating the code to the paper and report.

[Communication with original authors]

The reproducers have not communicated with the original authors, and in particular not on OpenReview.

[Hyperparameter Search]

Reproduction was only attempted on smaller models; the completeness of the hyperparameter search must be understood within this frame. That being said, for the MNIST model that the reproducers studied, and for a sweep of the authors' defined parameter "mu", the results are broadly in line with what the original authors obtained. This suggests that the authors' strategy for feature selection prioritizes important information over less important information.
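As a rough illustration, a sweep of that kind might be sketched as follows. This is a minimal sketch only: the lasso-style relevance formulation, the variable names, and the data shapes are assumptions made for illustration, not the original authors' or the reproducers' code.

```python
# Hypothetical sketch of a sweep over the sparsity parameter "mu".
# Assumption: relevant filters are selected by an l1-regularized fit of
# per-filter activations against a per-class target signal.
import numpy as np
from sklearn.linear_model import Lasso

def sweep_mu(responses, class_scores, mus):
    """responses: (n_samples, n_filters) activations for one class;
    class_scores: (n_samples,) target signal for that class."""
    selected = {}
    for mu in mus:
        selector = Lasso(alpha=mu, max_iter=10_000)
        selector.fit(responses, class_scores)
        # Filters that keep a non-zero weight are treated as "relevant".
        selected[mu] = np.flatnonzero(selector.coef_)
    return selected

# Larger mu -> sparser selection; the per-mu sets of relevant filters can
# then be fed into a subsequent filter-dropping ablation.
```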

[Ablation studies]

Ablation studies are not quite relevant to the present paper and report. Nevertheless, the reproducers did attempt to compare strategic filter dropping with random filter dropping, and it is confirmed that the authors' strategy identifies important features well.
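For concreteness, a comparison of this kind could look like the sketch below. PyTorch is assumed here, and `model`, `test_loader`, the convolutional layer name, and the per-filter `importance` scores are placeholders, not the reproducers' actual code.

```python
# Hypothetical sketch: accuracy after dropping "important" filters vs.
# dropping the same number of randomly chosen filters.
import copy
import random
import torch

def zero_filters(model, layer_name, filter_ids):
    """Return a copy of `model` with the given conv filters zeroed out."""
    ablated = copy.deepcopy(model)
    conv = dict(ablated.named_modules())[layer_name]
    with torch.no_grad():
        conv.weight[filter_ids] = 0.0
        if conv.bias is not None:
            conv.bias[filter_ids] = 0.0
    return ablated

@torch.no_grad()
def accuracy(model, loader):
    model.eval()
    correct = total = 0
    for x, y in loader:
        pred = model(x).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total

def ablation_curve(model, loader, layer_name, importance, ks):
    """For each k, drop the top-k most important filters vs. k random ones."""
    ranked = sorted(range(len(importance)), key=lambda i: -importance[i])
    results = []
    for k in ks:
        strategic = accuracy(zero_filters(model, layer_name, ranked[:k]), loader)
        rand_ids = random.sample(range(len(importance)), k)
        rand = accuracy(zero_filters(model, layer_name, rand_ids), loader)
        results.append((k, strategic, rand))
    return results

# If the importance scores are meaningful, the "strategic" accuracy should
# fall faster than the "random" accuracy as k grows.
```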

[Discussion of results] [Recommendations for reproducibility]

In light of how little of the paper has in fact been reproduced, the results cannot be said to be a full reproduction of the paper, but the present report does serve as a good smoke test and seems to indicate that the authors' work is worthy of further, fuller reproduction attempts.

[Overall organization and clarity]

The report feels skinny and lacking in meat; were it to be expanded to more datasets, it would be the better for it. Confidence: 4