spectralpython / spectral

Python module for hyperspectral image processing

Question regarding MNF transform #88

Closed: tordks closed this issue 4 years ago

tordks commented 5 years ago

Dear Thomas,

I am working with underwater hyperspectral imagery and have started to use your package (spectral_python). It works wonderfully, but I have a question regarding the implementation of the MNF transform. I have been looking through "A Faster Way to Compute the Noise-Adjusted Principal Components Transform Matrix" by R. E. Roger and the original MNF paper, which you reference in the code.

On line 1661 in spectral/algorithms/algorithms.py it says:

C = noise.sqrt_inv_cov.dot(signal.cov).dot(noise.sqrt_inv_cov)

I assume this is meant to be the noise-adjusted covariance. However, I cannot get it to match the methodology in the papers. By doing an eigenvalue decomposition I get that C = E.dot(C_adj).dot(E.T), where E is the matrix of eigenvectors of the noise covariance matrix and C_adj is the noise-adjusted covariance as defined in the mentioned papers.

My guess is that it is OK to use C since C and C_adj are similar matrices with the same eigenvalues; however, I am not sure whether the eigenvectors of C and C_adj point in the same directions. If the eigenvector matrix of C is V and the eigenvector matrix of C_adj is G, I get that G = E.T.dot(V). From my tests it seems that they do not point in the same directions, and hence I don't fully understand how C_adj can be substituted by C.

Is there a particular reason C is defined as it is? The results I get by applying the MNF seem reasonable, so it is probably just something I am overlooking. I can write up the calculation above more formally in LaTeX if that is preferable.
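In the meantime, here is a self-contained version of the numerical check I mentioned, with random SPD matrices standing in for the actual noise and signal covariances (all names below are just for this example):

import numpy as np

# Random SPD matrices standing in for the noise and signal covariances.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
noise_cov = A @ A.T + 3 * np.eye(3)
B = rng.standard_normal((3, 3))
signal_cov = B @ B.T + 3 * np.eye(3)

# Eigendecomposition of the noise covariance: noise_cov = E @ diag(d) @ E.T
d, E = np.linalg.eigh(noise_cov)

# C_adj as in the papers, with F = E @ diag(d**-0.5)
F = E @ np.diag(d ** -0.5)
C_adj = F.T @ signal_cov @ F

# C as on line 1661, with sqrt_inv_cov = E @ diag(d**-0.5) @ E.T
sqrt_inv_cov = E @ np.diag(d ** -0.5) @ E.T
C = sqrt_inv_cov @ signal_cov @ sqrt_inv_cov

# Similar matrices: the eigenvalues agree...
w_C, V = np.linalg.eigh(C)
w_adj, G = np.linalg.eigh(C_adj)
print(np.allclose(w_C, w_adj))                   # True

# ...but the eigenvectors differ by the rotation E: G = E.T @ V
# (compared in absolute value, since eigh does not fix column signs).
print(np.allclose(np.abs(G), np.abs(E.T @ V)))   # True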

Best regards Tord

tboggs commented 5 years ago

Hi Tord,

The line you reference is simply transforming the signal covariance into the whitened space of the noise covariance. Immediately after that, I compute the eigenvalues & eigenvectors of that transformed covariance. Note that those eigenvalues are equal to the estimated SNR + 1. Those results are then used in the MNFResult methods a few lines up in the file, depending on whether we are denoising or reducing dimensionality. Take a look at MNFResult.denoise and MNFResult.reduce and see whether those methods make sense.
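To illustrate the SNR + 1 point, here is a small sketch under a simple additive model (purely synthetic covariances; clean_cov etc. are just names for this example, not anything from the package):

import numpy as np

# Additive model: observed = clean + independent noise, so
# signal_cov = clean_cov + noise_cov (all matrices synthetic).
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
noise_cov = A @ A.T + 3 * np.eye(3)
B = rng.standard_normal((3, 3))
clean_cov = B @ B.T
signal_cov = clean_cov + noise_cov

# Whiten the signal covariance w.r.t. the noise, as on line 1661.
d, E = np.linalg.eigh(noise_cov)
siC = E @ np.diag(d ** -0.5) @ E.T    # inverse square root of noise_cov
C = siC @ signal_cov @ siC

# In the whitened space the noise contributes the identity, so each
# eigenvalue of C is the corresponding component SNR plus one.
snr = np.linalg.eigvalsh(siC @ clean_cov @ siC)
print(np.allclose(np.linalg.eigvalsh(C), snr + 1))   # True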

Regards Thomas

tordks commented 5 years ago

Thank you for the response. From the papers, the transform to the whitened space is given by C_adj = F.T.dot(signal_cov).dot(F), where F is defined as E.dot(delta_noise^(-1/2)). Here E is the matrix of eigenvectors of the noise covariance and delta_noise is the diagonal matrix of its eigenvalues.

However, by eigenvalue decomposition, the inverse square root of the noise covariance matrix is:

noise.sqrt_inv_cov = E.dot(delta_noise^(-1/2)).dot(E.T) = E.dot(F.T).

such that line 1661 becomes:

C = noise.sqrt_inv_cov.dot(signal.cov).dot(noise.sqrt_inv_cov) = E.dot(C_adj).dot(E.T)

This is the transformation from the papers, but premultiplied by E and postmultiplied by E.T. C and C_adj are therefore similar matrices with the same eigenvalues, but unless their eigenvectors point in the same directions, I am not sure that using C in place of C_adj is equivalent.
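The identity itself does seem to hold numerically; here is a quick check with random SPD matrices standing in for the covariances (same kind of synthetic setup as in my first comment):

import numpy as np

# Random SPD matrices standing in for the noise and signal covariances.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
noise_cov = A @ A.T + 3 * np.eye(3)
B = rng.standard_normal((3, 3))
signal_cov = B @ B.T + 3 * np.eye(3)

d, E = np.linalg.eigh(noise_cov)      # noise_cov = E @ diag(d) @ E.T
F = E @ np.diag(d ** -0.5)            # whitening matrix from the papers
siC = E @ np.diag(d ** -0.5) @ E.T    # noise.sqrt_inv_cov

C = siC @ signal_cov @ siC            # line 1661
C_adj = F.T @ signal_cov @ F          # the papers' noise-adjusted covariance

print(np.allclose(C, E @ C_adj @ E.T))   # True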

To summarize: I am confused because the whitening transformation in the papers seems to differ from the one in the code. Would you be so kind as to explain why C_adj can be interchanged with C, or point out where I am miscalculating?

tboggs commented 5 years ago

Line 1661 is computing what you are calling C_adj (the covariance of the data in the noise-whitened space). Could the difference be due to me storing eigenvectors in rows vs. the paper using columns?

The way that you describe noise.sqrt_inv_cov is the same way I compute it (see the definition of matrix_sqrt).
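In essence it is the eigendecomposition route; a minimal sketch of the idea (not the exact matrix_sqrt code):

import numpy as np

def sqrt_inv(C):
    # Symmetric inverse square root of an SPD matrix:
    # C ** -0.5 = E @ diag(d ** -0.5) @ E.T
    d, E = np.linalg.eigh(C)
    return E @ np.diag(d ** -0.5) @ E.T

# Sanity check: whitening C with its own inverse square root
# yields the identity.
A = np.random.rand(3, 3)
C = A @ A.T + np.eye(3)
siC = sqrt_inv(C)
print(np.allclose(siC @ C @ siC, np.eye(3)))   # True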

Just to show that the whitening is working as expected:

In [1]: import numpy as np

In [2]: import spectral as spy

In [3]: X = np.random.rand(3000).reshape(1000, 3)

In [4]: m = np.random.rand(3)

In [5]: C = np.cov(X, rowvar=False)

In [6]: stats = spy.GaussianStats(mean=m, cov=C)

In [7]: siC = stats.sqrt_inv_cov

In [8]: C
Out[8]: 
array([[ 0.08441659,  0.00126852,  0.00561934],
       [ 0.00126852,  0.08378285,  0.00023946],
       [ 0.00561934,  0.00023946,  0.08308598]])

Now, let's whiten the covariance:

In [9]: siC.dot(C).dot(siC)
Out[9]: 
array([[  1.00000000e+00,  -4.16401397e-17,   5.55111512e-17],
       [ -1.14180041e-17,   1.00000000e+00,   4.83771009e-16],
       [  1.80411242e-16,   5.47955778e-16,   1.00000000e+00]])

Or we can whiten the data by applying the whitening transform to the original data, then compute the covariance in the whitened space:

In [10]: X_whitened = siC.dot((X - m).T).T

In [11]: X_whitened.shape
Out[11]: (1000, 3)

In [12]: np.cov(X_whitened, rowvar=False)
Out[12]: 
array([[  1.00000000e+00,  -1.15578773e-16,   1.51141473e-16],
       [ -1.15578773e-16,   1.00000000e+00,   5.06768468e-16],
       [  1.51141473e-16,   5.06768468e-16,   1.00000000e+00]])
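Regarding the eigenvector directions: if you compose the whitening with the subsequent eigenvector projection, the extra rotation E drops out, so both routes should give the same components. A quick check with synthetic covariances (illustrative names only, not the package internals):

import numpy as np

# Random SPD matrices standing in for the noise and signal covariances.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
noise_cov = A @ A.T + 3 * np.eye(3)
B = rng.standard_normal((3, 3))
signal_cov = B @ B.T + 3 * np.eye(3)

d, E = np.linalg.eigh(noise_cov)
siC = E @ np.diag(d ** -0.5) @ E.T    # symmetric whitening (line 1661)
F = E @ np.diag(d ** -0.5)            # the papers' whitening

C = siC @ signal_cov @ siC
C_adj = F.T @ signal_cov @ F

_, V = np.linalg.eigh(C)              # eigenvectors of C
G = E.T @ V                           # G diagonalizes C_adj...
print(np.allclose(G.T @ C_adj @ G, np.diag(np.linalg.eigvalsh(C))))  # True

# ...and the composed transforms coincide: projecting the symmetrically
# whitened data onto V is the same map as projecting the papers'
# whitened data onto G.
print(np.allclose(V.T @ siC, G.T @ F.T))   # True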