distillpub / post--activation-atlas

Using feature inversion to visualize millions of activations from an image classification network, we create an explorable activation atlas of features the network has learned, which can reveal how the network typically represents some concepts.
https://distill.pub/2019/activation-atlas/

Review #2 (#3)

Closed: phillipi closed this issue 5 years ago

phillipi commented 5 years ago

The following peer review was solicited as part of the Distill review process. The review was formatted by the editor to help with readability.

The reviewer chose to keep anonymity. Distill offers reviewers a choice between anonymous review and offering reviews under their name. Non-anonymous review allows reviewers to get credit for the service they offer to the community.

Distill is grateful to the reviewer for taking the time to write the review.


What type of contributions does this article make?

Explanation of existing results

How significant are these contributions?

3/5

Communication:

Is the article well-organized, focused and structured?

4/5

Is the article well-written? Does it avoid needless jargon?

4/5

Diagram & Interface Style

5/5

Impact of diagrams / interfaces / tools for thought?

5/5

How readable is the paper, accounting for the difficulty of the topic?

4/5

Comments on communication:

Spelling/grammar errors in the colab notebook: "Simplied", "impelementations"

Spelling/grammar errors in article:

- "Not let's jump to the other side trying to understand on how the network" -> "trying to understand as to how the network"
- "wo classes where I would be hard-pressed" -> "wo classes where we would be hard-pressed"
- "increasing automating," -> "increasing automation,"

Misc: The images do not show on my installation of Firefox. The "Try in a notebook" feature is broken for the section on "Further Isolating Classes" and at the end [on a Chrome-based browser].

Missing detail: The attribution of average activations to classification labels is not explained in the text, making the relevant section confusing. Please detail the exact equations inline. Footnote (4) mentions that a linear approximation is used. Is that as in network saliency (Simonyan et al.) or Grad-CAM? Please elaborate inline, as this is important to the interpretation of several results.

Missing detail: Please explain the coherency trick in more detail in the text. It is briefly discussed in the code comments of the included notebook. It is an important part of the algorithm. A precise description of the trick will fill this gap.

Missing detail: Why are some of the images in the activation atlas smaller than the grid cell size? I notice that this correlates to the number of activation vectors used in the average.

Scientific correctness & integrity:

Are experiments in the article well designed, and interpreted fairly?

4/5

Does the article critically evaluate its limitations? How easily would a lay person understand them?

4/5

How easy would it be to replicate (or falsify) the results?

5/5

Does the article cite relevant work?

3/5

Considering all factors, does the article exhibit strong intellectual honesty and scientific hygiene?

4/5

Comments on scientific correctness & integrity:

"Nguyen, et al [10] use a similar technique in conjunction with feature visualization to show different facets of a feature, but the technique still doesn’t really show us what the network can represent."

Please specify a more concrete weakness for [10]. Alternatively, one could omit a weakness for [10], as the method is substantially different from the activation atlas. Also, contrary to the paragraph above, [10] is not similar to [9]: [10] does neuron maximization via direct optimization rather than t-SNE on activations. Totally different methods!

"Spatial activations show important combinations of many neurons, but are focused on a single example."

Citation needed. Please include the relevant citation in the caption of each subfigure of this diagram.

"won the ILSVRC contest in 2014" … please cite ILSVRC

"InceptionV1 builds up its understanding of images over several layers (see overview from [2] ). It was trained on ImageNet [11] ". ... trained on ILSVRC.

General comments:

The article requires a numerical analysis to confirm that the visualizations are representative of the average activations. In other words, please report the residual error in the feature inversion task being used to generate these visualizations. Are the average activations invertible? If so, to what extent?

The precise nature of the loss function used - cosine similarity * dot product - can be made clear using relevant equations. The submission briefly mentions these in a technical aside. Since this is a core component of the method albeit borrowed from feature visualization and included in the colab notebook, a mathematical description is essential. Please also detail how the two terms in the objective are balanced and the optimizer used (Gradient Descent, L-BFGS, etc.).

shancarter commented 5 years ago

Thanks for flagging the general typo and formatting issues. These should all be addressed now, but we have lost track of the specific commits for some of them.

- "Not let's jump to the other side trying to understand on how the network" -> "trying to understand as to how the network"
- "wo classes where I would be hard-pressed" -> "wo classes where we would be hard-pressed"
- "increasing automating," -> "increasing automation,"

Fixed in 5bafb6d9af3f5

Missing detail: The attribution of average activations to classification labels is not explained in the text, making the relevant section confusing. Please detail the exact equations inline. Footnote (4) mentions that a linear approximation is used. Is that as in network saliency (Simonyan et al.) or Grad-CAM? Please elaborate inline, as this is important to the interpretation of several results.

Thanks, we’ve added a detailed explanation of this, and explicitly linked to a reference implementation: [3b07221ce](https://github.com/distillpub/post--activation-atlas/commit/3b07221ce).
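For readers who want the shape of it without opening the commit, here is a minimal sketch of that kind of linear attribution. All names (`head`, `v`, `num_classes`) are hypothetical, and this is not the article's exact implementation: the attribution of an activation vector to a class is the gradient of the class logit with respect to the activation, dotted with the activation itself.

```python
import torch

# Hypothetical sketch: `head` maps an activation vector at the chosen layer
# to class logits; `v` is an averaged activation vector from the atlas.
def linear_attribution(head, v, num_classes):
    """First-order attribution: attribution_c = grad_v(logit_c) . v.
    Exact when `head` is linear in v; otherwise a local linearization,
    in the spirit of saliency maps (Simonyan et al.)."""
    v = v.clone().requires_grad_(True)
    logits = head(v)
    attributions = []
    for c in range(num_classes):
        (grad_c,) = torch.autograd.grad(logits[c], v, retain_graph=True)
        attributions.append(float((grad_c * v).sum()))  # gradient . activation
    return attributions
```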

Missing detail: Why are some of the images in the activation atlas smaller than the grid cell size? I notice that this correlates to the number of activation vectors used in the average.

We’ve added a description to the caption at the top of the article clarifying this.
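Purely as an illustration of the idea (the article's actual rendering rule may differ), a hypothetical sizing function that shrinks icons backed by fewer activation vectors:

```python
# Hypothetical sizing rule: icons averaged from fewer activation vectors
# are drawn smaller than the grid cell, signaling lower support.
def icon_size(n_vectors, n_max, cell_px=64, min_frac=0.4):
    frac = min_frac + (1.0 - min_frac) * (n_vectors / n_max) ** 0.5
    return int(round(cell_px * min(1.0, frac)))
```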

"Nguyen, et al [10] use a similar technique in conjunction with feature visualization to show different facets of a feature, but the technique still doesn’t really show us what the network can represent." Please specify a more concrete weakness for [10]. One may also not write a weakness for [10] as the method is substantially different from activation-atlas. Also, contrary to the paragraph above, [10] is not similar to [9]. [10] does neuron maximization via direct optimization rather than t-sne on activations. Totally different methods!

[10] does actually do a t-SNE on the activations, although it’s a bit less emphasized! Doing t-SNE on activations and then clustering images in that space is how [10] creates initial image seeds for feature visualization! However, it is still focused on individual neurons in the end and doesn’t really get at interactions between them.

We’ve updated the text to better highlight the similarities and differences.
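To make that seeding step concrete, here is a rough sketch (hypothetical names, not [10]'s actual code): embed the activations with t-SNE, cluster the embedding, and average each cluster's images to get one seed per facet.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Rough sketch of facet seeding (hypothetical names): t-SNE the activations,
# cluster the 2-D embedding, and average the images in each cluster to
# produce one seed image per facet for feature visualization.
def facet_seeds(activations, images, n_facets=10):
    embedding = TSNE(n_components=2).fit_transform(activations)
    labels = KMeans(n_clusters=n_facets, n_init=10).fit_predict(embedding)
    return np.stack([images[labels == k].mean(axis=0) for k in range(n_facets)])
```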

"Spatial activations show important combinations of many neurons, but are focused on a single example." Citation needed. Please include the relevant citation in the caption of each subfigure of this diagram.

We’ve added links to relevant articles for each subfigure as of f475e28915.

The statement regarding spatial activations is intended to be just an observation. Because they are real activation vectors, they are “real” combinations of neurons that occur in practice. And because we visualize one specific activation vector, each visualization looks at only a single example.

"won the ILSVRC contest in 2014" … please cite ILSVRC

"InceptionV1 builds up its understanding of images over several layers (see overview from [2] ). It was trained on ImageNet [11] ". ... trained on ILSVRC.

Thanks for catching that! We hadn’t realized there was a separate citation for ILSVRC. :)

The article requires a numerical analysis to confirm that the visualizations are representative of the average activations. In other words, please report the residual error in the feature inversion task being used to generate these visualizations. Are the average activations invertible? If so, to what extent?

Early work on feature inversion penalized L2 distance, in which case residual error was a very natural thing to measure. We aren’t really doing an “inversion” in this sense. Instead, we’re more grounded in the “activation maximization” literature. Since we want to visualize a direction in activation space instead of an individual neuron, this pans out to maximizing a dot product. This is expected to create an activation pattern that strongly activates neurons that fired strongly in the average activation, but it wouldn’t be expected to create a small residual.

(This is still something of a mystery to us: empirically, optimizing L2 just doesn’t work beyond really early layers, but optimizing the dot product creates meaningful visualizations, albeit exaggerated ones. Why exactly is that? It seems linked to the importance of linear directions in neural networks, but we’d like to understand it more deeply.)
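The contrast can be written down directly. A sketch with hypothetical names, where `f` maps an image `x` to (flattened) activations at the target layer and `v` is the averaged activation vector; the arrays can be numpy or torch tensors:

```python
def inversion_loss(f, x, v):
    # Feature inversion: minimize the L2 residual. Here "residual error"
    # is a natural quantity to report.
    return ((f(x) - v) ** 2).sum()

def direction_objective(f, x, v):
    # Activation maximization along a direction: maximize the dot product.
    # Nothing drives the residual to zero, so results are exaggerated.
    return (f(x) * v).sum()
```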

The precise nature of the loss function used - cosine similarity * dot product - can be made clear using relevant equations. The submission briefly mentions these in a technical aside. Since this is a core component of the method albeit borrowed from feature visualization and included in the colab notebook, a mathematical description is essential. Please also detail how the two terms in the objective are balanced and the optimizer used (Gradient Descent, L-BFGS, etc.).

Thanks, we’ve described this in more detail: 3b07221ce

Since we are just multiplying the dot product objective by cosine similarity, there’s only one term and no balancing is needed. We use Adam for the convenience of faster convergence, but plain gradient descent works fine.
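A minimal sketch of that single-term objective (hypothetical names; the real implementation is in the colab notebook and additionally uses lucid's image parameterization and transformation robustness, omitted here):

```python
import torch

# Hypothetical sketch: maximize dot(f(x), v) * cos(f(x), v) with Adam,
# where `f` maps an image to activations at the target layer and `v` is
# the averaged activation vector being visualized.
def visualize_direction(f, v, steps=512, lr=0.05, shape=(1, 3, 224, 224)):
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        a = f(x).flatten()
        dot = a @ v
        cos = dot / (a.norm() * v.norm() + 1e-8)
        loss = -(dot * cos)  # negate: optimizers minimize
        opt.zero_grad()
        loss.backward()
        opt.step()
    return x.detach()
```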

colah commented 5 years ago

Numerical scores contextualized with scoring rubric entries.

What type of contributions does this article make?

Explanation of existing results

How significant are these contributions?

3/5 - Results make a nice step of progress for the research community. // Better or different in some ways than existing explanations.

Communication:

Is the article well-organized, focused and structured?

4/5 - Article is thoughtfully organized and on point.

Is the article well-written? Does it avoid needless jargon?

4/5 - Text is not just readable, but engaging and flows fairly well.

Diagram & Interface Style

5/5 - [Between "Diagrams can be understood..." and "Diagrams minimize visual noise and focus the reader's attention on what's important..."]

Impact of diagrams / interfaces / tools for thought?

5/5 - Diagrams have a transformative impact. They make concepts much easier to understand, deeply engage with, and surface insights. https://distill.pub/2017/momentum is an exemplar.

How readable is the paper, accounting for the difficulty of the topic?

4/5 - [Between "Given the difficulty of the topic, the writing can be understood with reasonable effort." and "Considering the difficulty of the subject, the paper does a remarkable job at explaining the topic...."]

Scientific correctness & integrity:

Are experiments in the article well designed, and interpreted fairly?

4/5 - Claims in the paper are well supported. Experiments are thoughtfully designed, paper errs towards humility, notes potential weaknesses.

Does the article critically evaluate its limitations? How easily would a lay person understand them?

4/5 - [Between "The article acknowledges limitations, but may not be accessible beyond a research audience." and "The article critically examines its weaknesses, and communicates them clearly. A lay person or journalist reading this article reasonably carefully would understand the limitations of the contribution."]

How easy would it be to replicate (or falsify) the results?

5/5 - "Active reproducibility." Results are easy to reproduce and build on. For example, authors may provide hosted notebooks that allow re-running their experiments without even setting up infrastructure.

Does the article cite relevant work?

3/5 - Article does a typical job of citing literature.

Considering all factors, does the article exhibit strong intellectual honesty and scientific hygiene?

4/5 - Takes moderate steps beyond typical norms to facilitate reproduction, goes out of the way to show ways they could be wrong, etc.

phillipi commented 5 years ago

Thanks for addressing these comments! I feel the concerns have been adequately addressed.