Open kweonwooj opened 6 years ago
@kweonwooj
Hi Kweon Woo,
This paper, "SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability", explains how we can compare two representations in a way that is both invariant to affine transforms and fast to compute, by combining the SVD and CCA analysis methods. The paper explains the steps involved in comparing two representations nicely on page 3 and in the Appendix.
[SVCCA](https://arxiv.org/pdf/1706.05806.pdf)
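For concreteness, the SVD-then-CCA pipeline described on page 3 can be sketched in NumPy as follows. This is a minimal sketch under my own assumptions (function names, the variance threshold, and the regularization are mine, not from the authors' reference code):

```python
import numpy as np

def svcca(acts1, acts2, keep=0.99):
    """SVCCA sketch: SVD to prune low-variance directions, then CCA.
    acts1, acts2: (neurons, datapoints) activation matrices.
    Returns the mean canonical correlation (SVCCA similarity)."""

    def svd_reduce(X, keep):
        X = X - X.mean(axis=1, keepdims=True)        # center each neuron
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        # keep enough singular directions to explain `keep` of the variance
        k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), keep) + 1
        return np.diag(s[:k]) @ Vt[:k]               # (k, datapoints)

    X, Y = svd_reduce(acts1, keep), svd_reduce(acts2, keep)
    n = X.shape[1]
    # regularized covariances for numerical stability
    Sxx = X @ X.T / n + 1e-8 * np.eye(X.shape[0])
    Syy = Y @ Y.T / n + 1e-8 * np.eye(Y.shape[0])
    Sxy = X @ Y.T / n
    # canonical correlations = singular values of the whitened
    # cross-covariance; Cholesky factors act as whitening maps
    Lx, Ly = np.linalg.cholesky(Sxx), np.linalg.cholesky(Syy)
    T = np.linalg.solve(Ly, np.linalg.solve(Lx, Sxy).T).T
    rho = np.clip(np.linalg.svd(T, compute_uv=False), 0.0, 1.0)
    return rho.mean()
```

By construction the result is invariant to invertible linear maps of either representation, which is the affine-invariance property the paper highlights.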
I'm not able to understand the first plotted figure (Figure 1 of the paper) on the toy regression data. The description says the x-axis runs over the dataset, but what goes on the y-axis? For example, the first plot says "Neurons with highest activation". I tried to replicate it but could not produce a similar plot. My best guess is that each curve is a single neuron's activation over the dataset.
Can you please explain what the y-axis is in all the plots of Figure 1? Thank you!!
Abstract
Details
Analyzing value of single neuron over all train/valid dataset
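This "value of a single neuron over the whole dataset" view can be made concrete: a neuron's representation is its vector of activations over the datapoints, and each curve in Figure 1 is one such vector. A minimal sketch (the shapes and names here are illustrative assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
acts = rng.standard_normal((100, 5))     # (datapoints, neurons)
neuron_vectors = acts.T                  # one row = one neuron's
                                         # activations over the dataset
# rank neurons by their peak activation, as in the
# "neurons with highest activation" panel
order = np.argsort(neuron_vectors.max(axis=1))[::-1]
top_curve = neuron_vectors[order[0]]     # one curve: a single neuron
                                         # plotted over the dataset
```

Under this reading, the x-axis indexes datapoints and the y-axis is the activation value of each plotted neuron.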
Four main contributions
SVCCA
Result
SVCCA for Conv Layers
Applications
SVCCA similarity is computed from the canonical correlations (singular values of the whitened cross-covariance), analogous to a multidimensional Pearson correlation
Freeze Training : progressively freezing lower layers during training reduces training cost and improves generalization, motivated by the image below, where lower layers become similar to the fully trained model early in training
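As a toy illustration of freezing (not the paper's training setup), here is a two-layer linear model where the lower layer receives no updates, so no gradient has to be propagated through it:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))    # lower layer: frozen
W2 = rng.standard_normal((1, 8))    # upper layer: trained
X = rng.standard_normal((4, 64))
y = rng.standard_normal((1, 64))

W1_frozen = W1.copy()
loss_start = np.mean((W2 @ (W1 @ X) - y) ** 2)

lr = 0.01
for _ in range(200):
    h = W1 @ X                      # forward through the frozen layer
    err = W2 @ h - y
    # only W2 gets a gradient; backprop stops here, saving the
    # backward pass and weight update for the frozen lower layer
    W2 -= lr * (err @ h.T) / X.shape[1]

loss_end = np.mean((W2 @ (W1 @ X) - y) ** 2)
```

The cost saving is exactly the skipped backward computation through the frozen layers; the frozen weights stay byte-identical throughout.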
When are Classes learned?
compare the SVCCA similarity between the logits and every layer to observe when class-specific information is obtained
easier tasks are learned in early stages
Model Compression
Replacing the usual `W x X` with `(W x P_x^T) x (P_x x X)`, where `P_x` is the SVCCA projection matrix, reduces the number of FLOPs while retaining 99% of the information.
Appendix
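A toy NumPy check of this compression. As a stand-in for the actual SVCCA directions, the projection here is built from the top singular vectors of the (approximately low-rank) activations; the sizes and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_inputs, n_data, k = 256, 512, 1000, 64
W = rng.standard_normal((n_neurons, n_inputs))

# approximately low-rank activations (rank 32 signal + small noise)
X = (rng.standard_normal((n_inputs, 32)) @ rng.standard_normal((32, n_data))
     + 0.01 * rng.standard_normal((n_inputs, n_data)))

# stand-in for the SVCCA projection: top-k left singular vectors of X
U, _, _ = np.linalg.svd(X, full_matrices=False)
P = U[:, :k].T                        # (k, n_inputs)

full = W @ X                          # usual W x X
compressed = (W @ P.T) @ (P @ X)      # (W P^T)(P X); W P^T is precomputed

# per-example FLOPs: full costs n_neurons * n_inputs multiplies;
# compressed costs n_inputs * k (project) + n_neurons * k (apply)
flops_full = n_neurons * n_inputs
flops_comp = n_inputs * k + n_neurons * k
ratio = flops_comp / flops_full       # fraction of the original cost
```

Because the activations are nearly low-rank, the projected product matches the full one closely while doing a fraction of the multiplies.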
Personal Thoughts
Link: https://arxiv.org/pdf/1706.05806.pdf
Authors: Raghu et al., 2017