The hope is that this will stop APFL from blowing up, since the norm of F F.T is around 40M. Check what L is after this fix.
We normalize the matrix input (the batch from the update) rather than the channels individually, since scale matters for PCA (e.g., the relative scale between channels needs to be preserved).
Normalize F before doing PCA
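The note doesn't spell out which normalization is used, so here is a minimal sketch of one way to do it that matches the stated intent: scale the whole matrix F by a single scalar (its Frobenius norm) instead of normalizing each channel separately, so relative channel scales are preserved for PCA. The function name `normalize_for_pca` is hypothetical.

```python
import numpy as np

def normalize_for_pca(F: np.ndarray) -> np.ndarray:
    """Scale the whole matrix by one scalar (its Frobenius norm) rather
    than per-channel, so the relative scale between channels survives
    into PCA. A zero matrix is returned unchanged."""
    norm = np.linalg.norm(F)  # Frobenius norm of the full matrix
    if norm == 0.0:
        return F
    return F / norm

# F shaped (n_samples, n_channels); after normalization the norm of
# F F.T is bounded (||F_n F_n.T|| <= 1) instead of blowing up.
F = np.random.default_rng(0).normal(size=(128, 16)) * 1e3
F_n = normalize_for_pca(F)
print(np.linalg.norm(F_n))            # Frobenius norm of F_n
print(np.linalg.norm(F_n @ F_n.T))    # stays bounded
```

Dividing by one scalar leaves the ratios between channel norms untouched, which is the property the note asks to preserve; per-channel standardization would destroy it.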