distillpub / post--attribution-baselines

The repository for the submission "Visualizing the Impact of Feature Attribution Baselines"

Review #1


distillpub-reviewers commented 4 years ago

The following peer review was solicited as part of the Distill review process.

The reviewer chose to waive anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.

Distill is grateful to Ruth Fong for taking the time to review this article.


General Comments

Small suggestions to improve readability:

  1. increase the size of labels on diagrams (e.g., the slider tick labels for alpha are unreadable, and the y-axis tick labels on the eq. 4 graph are unreadable)
  2. add a bit more explanation around the figure of the first 4 equations (in particular, an explanation of the red line [eq. 4] -- is this a mean or a sum? -- and a clearer statement in the caption that the red line is the sum of the cumulative gradient at the current alpha over all pixels)
  3. provide a brief explanation (e.g., a footnote) of how a scalar is extracted per feature [i.e., pixel] given the 3D RGB vector per feature (e.g., is the max(abs(dy/dx)) taken across color channels, as is done in Simonyan et al., 2014?); see the sketch below
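
For concreteness, one common reduction (the one used by Simonyan et al., 2014) takes the maximum absolute gradient across the color channels. A minimal sketch, with a random array standing in for the actual gradient:

```python
import numpy as np

def per_pixel_saliency(grad_rgb):
    """Collapse an (H, W, 3) gradient into an (H, W) saliency map by
    taking the maximum absolute value across the three color channels
    (the reduction used by Simonyan et al., 2014)."""
    return np.max(np.abs(grad_rgb), axis=-1)

# Placeholder gradient standing in for d(output)/d(input):
grad_rgb = np.random.randn(224, 224, 3)
saliency = per_pixel_saliency(grad_rgb)  # shape (224, 224)
```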

The main weakness regarding "Scientific Correctness & Integrity" is the lack of discussion of related work and limitations; the specific points are quoted in the author response below.


Distill employs a reviewer worksheet as a help for reviewers.

The first three parts of this worksheet ask reviewers to rate a submission along certain dimensions on a scale from 1 to 5. While the scale meaning is consistently "higher is better", please read the explanations for our expectations for each score—we do not expect even exceptionally good papers to receive a perfect score in every category, and expect most papers to be around a 3 in most categories.

Any concerns or conflicts of interest that you are aware of?: No known conflicts of interest

What type of contributions does this article make?: Both an explanation of existing methods (i.e., integrated gradients) and the presentation of a novel method (i.e., expected gradients)

Advancing the Dialogue Score

How significant are these contributions? 3/5

Outstanding Communication Score

Article Structure 3/5
Writing Style 4/5
Diagram & Interface Style 3/5
Impact of diagrams / interfaces / tools for thought? 3/5
Readability 4/5

Scientific Correctness & Integrity Score

Are claims in the article well supported? 3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them? 1/5
How easy would it be to replicate (or falsify) the results? 3/5
Does the article cite relevant work? 2/5
Does the article exhibit strong intellectual honesty and scientific hygiene? 2/5
psturmfels commented 4 years ago

Thank you for the detailed comments! Based on your feedback, we’ve made some changes to the article and added several new sections. In particular:

“generally missing citations and mention of other kinds of attribution methods besides path ones”

“missing discussion with other highly related literature: SmoothGrad [Smilkov et al., arXiv 2017] and RISE [Petsiuk et al., BMVC 2018]”

I agree. In the name of brevity, the first draft of this article failed to cite several saliency/interpretability methods that were worth citing. In our latest draft we have significantly expanded the list of articles we cite. We still don’t discuss most non-path methods in detail, because we did not want to broaden the scope of the article too much. With that said, we added a new section, “Expectations, and Connections to SmoothGrad,” that discusses the formal connection between SmoothGrad and expected gradients. Throughout the article, we also now discuss the concept of a “baseline” input more generally, not just in the context of path attribution methods.
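
For readers who want the rough shape of that connection, here is a sketch of the two estimators side by side (not the article’s code; `grad_f` is a placeholder for a function returning the model’s gradient with respect to its input):

```python
import numpy as np

def smoothgrad(x, grad_f, sigma=0.15, n_samples=50):
    """SmoothGrad (Smilkov et al., 2017): average the gradient over
    Gaussian-perturbed copies of the input."""
    grads = [grad_f(x + np.random.normal(0.0, sigma, size=x.shape))
             for _ in range(n_samples)]
    return np.mean(grads, axis=0)

def expected_gradients(x, grad_f, baselines, n_samples=50):
    """Expected gradients: average the integrated-gradients integrand
    over baselines x' drawn from the data and alpha ~ Uniform(0, 1)."""
    samples = []
    for _ in range(n_samples):
        x_base = baselines[np.random.randint(len(baselines))]
        alpha = np.random.uniform()
        samples.append((x - x_base) * grad_f(x_base + alpha * (x - x_base)))
    return np.mean(samples, axis=0)
```

Both are Monte Carlo averages of gradients at perturbed inputs; they differ in how the perturbed inputs are drawn and in whether the gradient is scaled by the difference from the baseline.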

“should briefly discuss that inputs being presented (interpolation between two images) are outside the training domain”

Good point. We now briefly mention this in the section “The Pitfalls of Ablation Tests.” We also note that there is a broader discussion about whether or not we should present images outside the training domain to our models, one that goes beyond the scope of our original article.

“room to improve discussion on single input choice (what about other typical choices for the baseline value besides a constant color, such as random noise or blurred input [Fong and Vedaldi, 2017])”

This is another really good point. Despite being an article about baselines, our original draft presented only two baseline choices. Based on this feedback, we added two new sections, “Alternative Baseline Choices” and “Averaging Over Multiple Baselines,” that cover existing ideas about baselines. In particular, we discuss and visualize random noise and blurred inputs as baselines. We now present more than six different baselines and several variants thereof, and hope that this provides a better and more nuanced picture of the possible choices.
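
As a rough illustration of the kinds of baselines discussed there, and of what averaging over multiple baselines means in practice (a sketch only; the blur width, noise range, and `attribution_fn` below are illustrative assumptions, not the article’s settings):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_baselines(x, sigma_blur=5.0, seed=0):
    """Construct a few common baselines for an image x in [0, 1]
    with shape (H, W, 3)."""
    rng = np.random.default_rng(seed)
    return {
        "black": np.zeros_like(x),
        "uniform_noise": rng.uniform(0.0, 1.0, size=x.shape),
        "blurred": gaussian_filter(x, sigma=(sigma_blur, sigma_blur, 0)),
    }

def average_over_baselines(x, baselines, attribution_fn):
    """'Averaging Over Multiple Baselines': compute one attribution map
    per baseline with any single-baseline method (attribution_fn is a
    placeholder, e.g. integrated gradients) and take the mean."""
    return np.mean([attribution_fn(x, b) for b in baselines], axis=0)
```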

“how does expected gradients stand up to other desiderata for interpretability (i.e., Sanity Checks [Adebayo et al., NeurIPS 2018], Lipton ICML Workshop 2016)”

I agree that it is important to discuss how to evaluate interpretability methods in our article, which is something our first draft omitted. Our new section “Comparing Saliency Methods” does this. Although it doesn’t comprehensively evaluate all of the baselines we present, we hope that it provides some relevant discussion in this area. We don’t run more comprehensive evaluations on our baselines mostly for computational reasons: there are many different baselines, hyper-parameters and evaluation metrics to compare across and we wanted to keep the main focus on the assumptions behind each baseline rather than a quantitative assessment of them.
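
For context, the model-parameter-randomization sanity check from Adebayo et al., 2018 that the reviewer mentions can be sketched roughly as follows (`attribution_fn` and `randomize_weights` are hypothetical placeholders, not functions from our code):

```python
import numpy as np
from scipy.stats import spearmanr

def randomization_sanity_check(x, model, attribution_fn, randomize_weights):
    """Model-parameter-randomization check (Adebayo et al., 2018):
    attributions for a trained model should differ substantially from
    attributions for the same architecture with randomized weights."""
    attr_trained = attribution_fn(model, x).ravel()
    attr_random = attribution_fn(randomize_weights(model), x).ravel()
    # A high rank correlation suggests the method is insensitive to
    # the model's learned parameters, which would be a red flag.
    rho, _ = spearmanr(np.abs(attr_trained), np.abs(attr_random))
    return rho
```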

“provide more explanation for sum of cumulative gradients for expected gradients (i.e., why is it desirable that the red line is close to the blue line? what does that mean?)”

I tried to do this in our new draft. We have some additional discussion on the completeness axiom right after the third figure.
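
For readers following that discussion, the property at stake is the completeness axiom: the attributions should sum to the difference in model outputs between the input and the baseline. A minimal sketch of checking this (with `f` and `grad_f` as placeholder callables for a scalar-valued model and its gradient):

```python
import numpy as np

def integrated_gradients(x, baseline, f, grad_f, n_steps=100):
    """Riemann-sum approximation of integrated gradients from baseline
    to x, returning the attributions and the completeness gap."""
    alphas = (np.arange(n_steps) + 0.5) / n_steps  # midpoint rule
    grads = [grad_f(baseline + a * (x - baseline)) for a in alphas]
    attributions = (x - baseline) * np.mean(grads, axis=0)
    # Completeness: the attributions should sum to f(x) - f(baseline);
    # the gap shrinks as n_steps grows (for smooth models).
    gap = attributions.sum() - (f(x) - f(baseline))
    return attributions, gap
```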

“more examples beyond the 4 that are used throughout would be appreciated (i.e., in the last figure, consider adding more examples; why does the owl example look 'better' for integrated gradients than for expected gradients?)”

“to improve reproducibility, having a 'repro in colab notebook' button for at least one of the figures would be a nice to have”

Unfortunately, I didn’t address either of these points. I didn’t address the former because when I tried adding more examples to the last figure, it became quite cluttered and took a while to load. I haven’t addressed the latter simply because I felt the other points were more important to address; I will make an effort to add a Colab notebook for at least one figure in the future.

Overall, I hope that we’ve managed to address at least some of your main concerns, especially those regarding omissions of existing ideas in the literature. We’ve made significant efforts to better situate this article within that literature, and will continue to do so based on future feedback.