Closed by phillipi 5 years ago
Thanks for your detailed feedback. We made several modifications to address your concerns.
The reviewed version was still slightly unfinished (missing citations, the fireboat and streetcar comparisons did not load, etc.)
These should all be polished up now, thanks! It is possible that on very old versions of Firefox some interactive elements might not perform perfectly, but we have tested the article on the past several major versions of the most popular browsers.
I found the future work section especially misleading. It is not clear how these dictionaries relate to generative models when they are established on one particular dataset.
We’ve rewritten parts of the interfaces discussion in the future work section to further clarify these ideas.
We've tried to be very clear that these ideas are speculative, for example by framing things with "We think..." or "We hope...". That said, we believe that this kind of speculation is important for research, conveying intuition and motivation.
Similarly, for the whole section that investigates the manifold, it is important to note that these are just post-hoc observations, and that the curved paths are merely intriguing features of this particular projection.
We’ve added a paragraph mentioning this: f56aeb7301a6a631dc991ac8b523ebe612df2568
In the section “Aggregating multiple images” it is not described how the class attributions are computed. Also, the attributions for the earlier layers do not make much sense, so it might be useful to describe why that is the case.
Thanks! We’ve rewritten that section to clarify this, in addition to explicitly linking to a reference implementation. 3b07221ce
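For readers curious about the mechanics before reading the reference implementation, here is a minimal sketch of a gradient-times-activation class attribution, a common linear approximation; the array shapes and the function name are illustrative assumptions, not the reference implementation itself:

```python
import numpy as np

def class_attribution(activations, logit_grads):
    """Approximate each patch's contribution to a class logit
    (gradient-times-activation, a common linear approximation).

    activations: (n_patches, n_channels) hidden-layer activations.
    logit_grads: (n_patches, n_channels) gradient of the class
        logit with respect to those activations.
    Returns an (n_patches,) array of attribution scores.
    """
    # Per-patch dot product between activation and gradient.
    return (activations * logit_grads).sum(axis=-1)
```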
It is not clear whether the combination of concepts (the sand, dune, and sandbar) is selected based on the attribution classes or based on the combination of their feature representations. It might be an interesting future direction to estimate these relations programmatically.
In principle, you could certainly do this programmatically -- for example, you could use the same techniques we used to do attribution between layers in Building Blocks, or just trace particular data points. One really cool possibility is an animation of transitions between layers. That said, this example was human-selected, and then verified by looking at attributions to the logits.
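To make the "trace particular data points" idea concrete, here is a minimal hypothetical sketch; the atlas arrays and the nearest-cell heuristic are assumptions for illustration, not how the article itself computes anything:

```python
import numpy as np

def nearest_cell(activation, atlas_cells):
    """Index of the atlas cell whose averaged activation vector is
    closest (in Euclidean distance) to this patch's activation."""
    return np.linalg.norm(atlas_cells - activation, axis=1).argmin()

def trace_between_layers(act_a, act_b, atlas_a, atlas_b):
    """Follow one image patch from layer A's atlas to layer B's.

    act_a, act_b: activation vectors for the same patch at layers
        A and B (shapes (n_channels_a,) and (n_channels_b,)).
    atlas_a, atlas_b: (n_cells, n_channels) averaged cell vectors.
    Returns the pair of nearest cell indices at the two layers.
    """
    return nearest_cell(act_a, atlas_a), nearest_cell(act_b, atlas_b)
```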
While the observations are really interesting, the authors might more often point out that this is only cherry-picked evidence, and that verifying the applicability of this method would require more quantitative evaluation. This is not to say that the method could not become a useful tool for, e.g., verifying adversarial examples, but it is important to note that it is only one of the possible analyses.
We’ve added some new results to the bottom of the “Further Isolating Classes” section that show the results of the technique run on thousands of images.
4/5 - Subject-matter experts would learn a lot from reading this paper. // Significant improvement or new angle over previous explanations.
5/5 - Article is thoughtfully structured, and carefully balances motivation, detail and brevity against each other.
5/5 - Text is very readable, engaging and has good flow.
5/5 - Diagrams minimize visual noise and focus the reader's attention on what's important. They make effective use of best practices (including gestalt principles and alignment, appropriate captioning and labeling, effective use of color, etc.)
5/5 - Diagrams have a transformative impact. They make concepts much easier to understand, deeply engage with, and surface insights. https://distill.pub/2017/momentum is an exemplar.
4/5 - [Between "Given the difficulty of the topic, the writing can be understood with reasonable effort." and "Considering the difficulty of the subject, the paper does a remarkable job at explaining the topic."]
3/5 - Claims in paper are reasonably supported, as appropriate based on the paper's framing of them. Major caveats are noted.
4/5 - Between "The article acknowledges limitations, but may not be accessible beyond a research audience." and "The article critically examines its weaknesses, and communicates them clearly. A lay person or journalist reading this article reasonably carefully would understand the limitations of the contribution."
5/5 - "Active reproducibility." Results are easy to reproduce and build on. For example, authors may provide hosted notebooks that allow re-running their experiments without even setting up infrastructure.
4/5 - Article strikes a good balance between keeping the article tight and orienting the reader in related work / being academically generous. May use an appendix or footnotes in balancing these needs.
5/5 - Takes significant steps beyond typical norms to facilitate reproduction, go out of their way to show way they could be wrong, etc.
Thanks for the revisions! I think these adequately address the reviewer's concerns.
As a suggestion, I think the tone of the article could still be muted a bit, to reflect the exploratory nature of this research. The revisions do a good job of pointing out limitations. However, as Reviewer 1 noted, some of the observations come off as slight exaggerations, and I think this is partially due to using strong words. Here are a few examples where I think the tone could be adjusted: "By combining these two techniques, we can get the best of both worlds." (best) "These atlases not only reveal nuanced visual abstractions within a model." (nuanced) "Activation Atlases still give us an unprecedented overview of what neural networks can represent." (unprecedented)
Thanks @phillipi for the suggestions. I've incorporated them here along with a few more along the same lines.
Thanks! I think this will mitigate potential criticism readers could have since many of the observations are qualitative.
The following peer review was solicited as part of the Distill review process. The review was formatted by the editor to help with readability.
The reviewer chose to keep anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.
Distill is grateful to the reviewer for taking the time to write the review.
What type of contributions does this article make?
Explanation of existing results
How significant are these contributions?
4/5
Communication:
Is the article well-organized, focused and structured?
5/5
Is the article well-written? Does it avoid needless jargon?
5/5
Diagram & Interface Style
5/5
Impact of diagrams / interfaces / tools for thought?
5/5
How readable is the paper, accounting for the difficulty of the topic?
4/5
Comments on communication:
The reviewed version was still slightly unfinished (missing citations, the fireboat and streetcar comparisons did not load, etc.)
Scientific correctness & integrity:
Are experiments in the article well designed, and interpreted fairly?
3/5
Does the article critically evaluate its limitations? How easily would a lay person understand them?
4/5
How easy would it be to replicate (or falsify) the results?
5/5
Does the article cite relevant work?
4/5
Considering all factors, does the article exhibit strong intellectual honesty and scientific hygiene?
5/5
Comments on scientific correctness & integrity:
In certain cases, the observations are slightly exaggerated. Especially in the "new interfaces" section, the future of the method seems slightly over-hyped. The work shows an interesting idea of combining existing techniques for a better understanding of hidden representations, which produces interesting qualitative observations for a curious reader. However, it only helps to understand the inner workings of one particular model for image classification, and the relationship to generative models is unclear, especially when the main message of this work seems to be that the concepts are data-dependent (not only through sub-sampling of the patches; the work also clearly shows that discriminability within the dataset is key).
General comments:
This submission shows a visualisation of feature vectors obtained by averaging values in a discriminable reprojection of a subset of patches from the training dataset. The authors show empirically, on selected examples, that these centroids have interesting semantic properties.
The execution of this submission is exceptional. It caters especially to curious readers who are intrigued by the fact that hidden image representations are extremely hard to interpret. However, at certain points the presentation feels as though it slightly exaggerates the results.
While the observations are really interesting, the authors might more often point out that this is only cherry-picked evidence, and that verifying the applicability of this method would require more quantitative evaluation. This is not to say that the method could not become a useful tool for, e.g., verifying adversarial examples, but it is important to note that it is only one of the possible analyses.
I found the future work section especially misleading. It is not clear how these dictionaries relate to generative models when they are established on one particular dataset, especially since most of the results verify the built-in biases of the ILSVRC dataset. As such, it might be advisable to be more cautious in assuming possible applications.
Similarly, for the whole section that investigates the manifold, it is important to note that these are just post-hoc observations, and that the curved paths are merely intriguing features of this particular projection.
All in all, I find this work to be of exceptional quality, and some of the presented results are really interesting. However, in my opinion, it would be useful to point out more prominently, in the main text, the caveats of qualitative evaluation and the dangers of purely empirical evidence. As such, I think some statements are slightly exaggerated. However, the authors discuss most of the mentioned limitations in the conclusions.
Minor issues. Please note that some of these might be caused by my misunderstandings: