VahidooX / DeepLDA

An implementation of Deep Linear Discriminant Analysis (DeepLDA) in Keras
MIT License

Not solving the good equation #1

Open tchaton opened 6 years ago

tchaton commented 6 years ago

Hello,

I am converting this code to TensorFlow, even though it is a bit complicated. While studying the code, I found that you are solving evals_t = T.slinalg.eigvalsh(Sb_t, St_t), whereas it should be evals_t = T.slinalg.eigvalsh(Sb_t, Sw_t) to follow the equation. The original reference code also does: from scipy.linalg.decomp import eigh; evals, evecs = eigh(Sb, Sw). My email address is Thomas.Chaton@uk.fujitsu.com if you want to talk about it.
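For clarity, here is a minimal sketch of the two calls being compared, using SciPy's generalized symmetric eigensolver. The matrices below are random stand-ins just to show the two calls, not scatter matrices computed from real class labels:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(100, 10)), rng.normal(size=(100, 10))
Sb = np.cov(X.T)        # stand-in for the between-class scatter
Sw = np.cov(Y.T)        # stand-in for the within-class scatter
St = Sb + Sw            # total scatter

evals_w = eigh(Sb, Sw, eigvals_only=True)   # Sb v = lambda Sw v  (reference scipy code)
evals_t = eigh(Sb, St, eigvals_only=True)   # Sb v = lambda St v  (what this repo solves)
```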

RoyiAvital commented 6 years ago

By the way, where is the gradient calculation of the objective? Or are you relying on Auto Grad?

tchaton commented 6 years ago

Yeah, we are relying on autoGrad to do it.


RoyiAvital commented 6 years ago

@tchaton, is it smooth? I'd assume a subgradient is needed, since the objective function includes a threshold.

tchaton commented 6 years ago

In Theano it works fine. I implemented it in TensorFlow, but it doesn't support it: the loss was exploding even with the matrix simplification... I was surprised people were able to implement autograd over a GEP (generalized eigenvalue problem) solver.
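For readers following along: the threshold mentioned above is, as far as I understand the DeepLDA paper, the eigenvalue thresholding in the objective, where only the eigenvalues still below min(eigenvalues) + epsilon are optimised. A rough sketch of that loss (variable names are illustrative, not from this repository):

```python
import numpy as np

def deeplda_loss(eigvals, eps=1.0):
    """Rough sketch of the thresholded DeepLDA objective (to be maximised).

    Only eigenvalues below min(eigvals) + eps contribute, so training
    concentrates on the worst-separated directions. This hard threshold
    is exactly what makes the objective only piecewise smooth.
    """
    thresh = eigvals.min() + eps
    selected = eigvals[eigvals <= thresh]
    return selected.mean()

# Example: with eps=1.0 only the two smallest eigenvalues are averaged here.
print(deeplda_loss(np.array([0.2, 0.9, 2.5, 3.1])))
```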


RoyiAvital commented 6 years ago

Did you verify that the autograd results match those in the appendix of the paper?

tchaton commented 6 years ago

I didn't. I assumed it had been tested beforehand.


RoyiAvital commented 6 years ago

It seems it hasn't been.

zorrocai commented 6 years ago

@tchaton I have written DeepLDA in PyTorch, but it seems that it doesn't work well either. Instead of the exploding-loss problem in your TensorFlow version, the training problem in the PyTorch version is that all the eigenvalues converge to 1 and the trace-ratio matrix converges to the identity matrix. I found that the reason for this is the formula Sw + lambda*I. But when I remove that line of code, a singular-matrix problem comes up during training. Do you have any thoughts on this problem?

tchaton commented 6 years ago

The lambda*I term is there to prevent det(Sw) from becoming 0, i.e. to keep Sw invertible. It should work better with it, but I have no idea why it doesn't. I reported the error to the TensorFlow team and never got an answer, and it is a very important feature.
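For reference, a minimal sketch of the regularised within-class scatter being discussed (illustrative names, not this repository's actual code): the small ridge term keeps Sw positive definite so the generalized eigensolver never hits a singular matrix.

```python
import numpy as np

def within_scatter(features, labels, lambda_reg=1e-3):
    dim = features.shape[1]
    classes = np.unique(labels)
    Sw = np.zeros((dim, dim))
    for c in classes:
        Xc = features[labels == c]
        Xc = Xc - Xc.mean(axis=0)
        Sw += Xc.T @ Xc / max(len(Xc) - 1, 1)  # per-class covariance
    Sw /= len(classes)
    return Sw + lambda_reg * np.eye(dim)       # Sw + lambda * I
```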


tchaton commented 6 years ago

Still, looking at the paper, this code doesn't solve the correct equation...

zorrocai commented 6 years ago

Thanks. There are so many hard obstacles in the DeepLDA model...

tchaton commented 6 years ago

Can you send me the PyTorch code?


zorrocai commented 6 years ago

@tchaton Of course, I am sending the PyTorch code to your email Thomas.Chaton@uk.fujitsu.com.

tchaton commented 6 years ago

Thank you!


dmatte commented 6 years ago

Hi,

this is Matthias from the Theano DeepLDA repo. I just wanted to respond to some of your comments.

@tchaton

evals_t = T.slinalg.eigvalsh(Sb_t, St_t) where it should be evals_t = T.slinalg.eigvalsh(Sb_t, Sw_t) to follow the equation. And in the original code from scipy.linalg.decomp import eigh evals, evecs = eigh(Sb, Sw)

Both versions should be fine. Actually, the two formulations should be equivalent. You can find a reference on this in Section 2 of this paper: http://mi.eng.cam.ac.uk/~cipolla/publications/inproceedings/2007-CVPR-Kim-incremental.pdf

I also just checked the two implementations on MNIST, and both worked in my Theano implementation.
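A quick numerical way to see this equivalence: since St = Sb + Sw, the two formulations share the same eigenvectors and the eigenvalues map onto each other monotonically via lambda_t = lambda_w / (1 + lambda_w). A sketch with random positive-definite stand-ins (not MNIST scatter matrices):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
A, B = rng.normal(size=(50, 10)), rng.normal(size=(50, 10))
Sb = A.T @ A / 50                        # between-class scatter stand-in (PSD)
Sw = B.T @ B / 50 + 1e-3 * np.eye(10)    # within-class scatter stand-in (PD)
St = Sb + Sw

lam_w = eigh(Sb, Sw, eigvals_only=True)  # Sb v = lambda_w Sw v
lam_t = eigh(Sb, St, eigvals_only=True)  # Sb v = lambda_t St v

# Same eigenvectors; eigenvalues related by the strictly increasing map
# lambda_t = lambda_w / (1 + lambda_w), so both rankings agree.
assert np.allclose(lam_t, lam_w / (1.0 + lam_w))
```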

@RoyiAvital

Did you verify results of the Auto Grad are like in the appendix of the paper?

When we did our Theano implementation and experiments, we also relied on autograd. For the paper (appendix) we wanted to also provide a sketch of how to get to the gradients of the DeepLDA objective. The actual gradients of the generalized eigenvalues that apply in DeepLDA can be found in Theano's EigvalshGrad operator: https://github.com/Theano/theano/blob/d395439aec5a6ddde8ef5c266fd976412a5c5695/theano/tensor/slinalg.py#L385-L440

This should be in line with what is provided in Section 2.1 Equation (4) (Generalized Eigen Value Problems) of De Leeuw 2007 (http://gifi.stat.ucla.edu/speed/deleeuw_R_07c.pdf)
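For anyone wanting to sanity-check that formula numerically: for a simple generalized eigenvalue with a B-normalised eigenvector (v^T B v = 1, which is how scipy.linalg.eigh normalises), the gradients reduce to dlambda/dA = v v^T and dlambda/dB = -lambda v v^T. A small finite-difference check (a sketch, not code from any of the implementations discussed):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
n = 6
A = rng.normal(size=(n, n)); A = A + A.T                  # symmetric
B = rng.normal(size=(n, n)); B = B @ B.T + n * np.eye(n)  # symmetric positive definite

lam, V = eigh(A, B)         # A v = lam B v, columns of V satisfy v^T B v = 1
k = 2                       # pick a simple eigenvalue to differentiate
v = V[:, k]
grad_A = np.outer(v, v)     # analytic gradient w.r.t. A
# (the gradient w.r.t. B would be -lam[k] * np.outer(v, v))

eps = 1e-6
dA = rng.normal(size=(n, n)); dA = (dA + dA.T) / 2        # random symmetric direction
numeric = (eigh(A + eps * dA, B, eigvals_only=True)[k] - lam[k]) / eps
print(numeric, np.sum(grad_A * dA))                       # should agree to ~1e-5
```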

Regarding the gradients (and numerical stability) in PyTorch and TensorFlow, I unfortunately can't comment. As I said, when we first tried the idea with Theano everything worked fine for us from the beginning, so we did not investigate this in detail.

tchaton commented 6 years ago

Hello,

Wow, you did a great job. Did you check the part of my code where I use a Cholesky decomposition to simplify the equation? I was also thinking of using the Moore-Penrose pseudoinverse (https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse) in case the matrix is not invertible for some reason. Do you have any thoughts on those?
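For reference, the Cholesky reduction mentioned here usually looks like the following sketch (illustrative, not this repository's actual code): factor Sw = L L^T and solve a standard symmetric eigenproblem on L^{-1} Sb L^{-T}, which avoids forming an explicit inverse.

```python
import numpy as np
from scipy.linalg import cholesky, eigh, solve_triangular

def generalized_eigvals_via_cholesky(Sb, Sw):
    L = cholesky(Sw, lower=True)
    # M = L^{-1} Sb L^{-T}, built with triangular solves instead of an inverse
    tmp = solve_triangular(L, Sb, lower=True)
    M = solve_triangular(L, tmp.T, lower=True)
    M = (M + M.T) / 2                  # re-symmetrise against round-off
    return eigh(M, eigvals_only=True)  # same eigenvalues as eigh(Sb, Sw)
```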

Best, Tom

2018-05-29 12:41 GMT+00:00 zorrocai notifications@github.com:

@tchaton @dmatte I think the problem that caused the exploding loss in your TensorFlow version, or the poor loss in the PyTorch version, may be the different eigenvalue decomposition methods. There are two basically different eigendecomposition methods in both SciPy and NumPy: eig (eigenvalues and right eigenvectors for non-symmetric arrays) and eigh (eigenvalues and right eigenvectors for symmetric/Hermitian arrays), and they don't give the same eigenvalues. SciPy's eig and eigh are totally different; NumPy's eig and eigh return the same eigenvalues in a different order, but NumPy's eigh returns different eigenvalues from SciPy's eigh. Does this mean $A\vec{e}_i = v_i B\vec{e}_i$ has different eigenvalues from $\frac{A}{B}\vec{e}_i = v_i\vec{e}_i$? What's more, I compared torch.eig with the above methods: torch.eig has the same eigenvalues as numpy.eig. Though s_b/s_w is symmetric, torch.svd returns singular values with the minus sign dropped on some eigenvalues, like this:

```python
>>> import numpy as np
>>> import scipy
>>> import torch
>>> a = np.random.rand(1000, 10)
>>> b = np.random.rand(1000, 10)
>>> s_w = np.cov(a.T)
>>> s_b = np.cov(b.T)
>>> from scipy.linalg.decomp import eigh
>>> evals, evecs = eigh(s_b, s_w)
>>> evals, evecs = scipy.linalg.eigh(s_b, s_w)
>>> evals
array([ 0.82751224,  0.8575706 ,  0.90092649,  0.93457503,  0.96302211,
        1.00898617,  1.02661205,  1.10996722,  1.14103357,  1.26663491])
>>> evals, evecs = scipy.linalg.eig(s_b, s_w)
>>> evals
array([ 1.26663491+0.j,  1.14103357+0.j,  1.10996722+0.j,  0.82751224+0.j,
        0.85757060+0.j,  0.90092649+0.j,  0.93457503+0.j,  0.96302211+0.j,
        1.02661205+0.j,  1.00898617+0.j])
>>> evals, evecs = np.linalg.eigh(s_b/s_w)
>>> evals
array([-14.82691125,  -9.93629821,  -3.0339017 ,  -0.99347968,  -0.23726914,
         1.46624221,   2.76740404,   4.01231621,  12.13940769,  18.58078932])
>>> evals, evecs = np.linalg.eig(s_b/s_w)
>>> evals
array([ 18.58078932, -14.82691125,  12.13940769,  -9.93629821,  -3.0339017 ,
         4.01231621,   2.76740404,  -0.99347968,  -0.23726914,   1.46624221])
>>> s_w = torch.from_numpy(s_w)
>>> s_b = torch.from_numpy(s_b)
>>> evals, evecs = torch.eig(s_b/s_w)
>>> evals
tensor([[ 18.5808,   0.0000],
        [-14.8269,   0.0000],
        [ 12.1394,   0.0000],
        [ -9.9363,   0.0000],
        [ -3.0339,   0.0000],
        [  4.0123,   0.0000],
        [  2.7674,   0.0000],
        [ -0.9935,   0.0000],
        [ -0.2373,   0.0000],
        [  1.4662,   0.0000]], dtype=torch.float64)
>>> u, s, v = torch.svd(s_b/s_w)
>>> s
tensor([ 18.5808,  14.8269,  12.1394,   9.9363,   4.0123,   3.0339,   2.7674,
          1.4662,   0.9935,   0.2373], dtype=torch.float64)
>>> torch.equal((s_b/s_w).t(), s_b/s_w)
True
```


tchaton commented 6 years ago

Hello,

I have found this very interesting code: https://github.com/roatienza/Deep-Learning-Experiments/blob/master/Experiments/Tensorflow/Math/decomposition.py

Make sure to look at it :)

Best, Tom


zorrocai commented 6 years ago

Hi @tchaton, yeah, for a symmetric matrix the singular values are the absolute values of the eigenvalues; you can see the details in the Matrix Cookbook: https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf. And I found that I made a mistake in the equation: $S_b\vec{e}_i = v_i S_w\vec{e}_i$ is the same as $S_w^{-1} S_b\vec{e}_i = v_i\vec{e}_i$, NOT the element-wise $\frac{S_b}{S_w}\vec{e}_i = v_i\vec{e}_i$ I used above, because $S_w^{-1} S_b \neq S_b / S_w$ (element-wise division). What is more, though Sw (within-class scatter matrix) and Sb (between-class scatter matrix) are both symmetric, $S_w^{-1} S_b$ is not symmetric.
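To make the correction concrete, a small check with random stand-in matrices (NumPy's / is element-wise, so Sb / Sw is unrelated to $S_w^{-1} S_b$):

```python
import numpy as np
from scipy.linalg import eigh, eigvals, solve

rng = np.random.default_rng(0)
A, B = rng.normal(size=(200, 10)), rng.normal(size=(200, 10))
Sw, Sb = np.cov(A.T), np.cov(B.T)

gen = eigh(Sb, Sw, eigvals_only=True)               # Sb v = lambda Sw v
via_inv = np.sort(eigvals(solve(Sw, Sb)).real)      # spectrum of Sw^{-1} Sb
elementwise = np.sort(np.linalg.eigvalsh(Sb / Sw))  # element-wise division

print(np.allclose(gen, via_inv))      # True: same eigenvalues
print(np.allclose(gen, elementwise))  # False: a different matrix entirely
```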

Best, Zorro

RoyiAvital commented 6 years ago

@zorrocai ,

The singular values of $A$ are the square roots of the eigenvalues of $A^{T} A$ (or $A A^{T}$, which shares the same nonzero eigenvalues). Hence the connection between the two is $\sigma_i \left( A \right)^{2} = \lambda_i \left( A^{T} A \right)$, and for a symmetric $A$ this gives $\sigma_i \left( A \right) = \left| \lambda_i \left( A \right) \right|$.
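A quick check of that relationship with NumPy (which is also why torch.svd above returned the eigenvalues with their signs dropped):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5)); A = (A + A.T) / 2        # symmetric test matrix

sigma = np.linalg.svd(A, compute_uv=False)            # singular values
lam = np.linalg.eigvalsh(A)                           # eigenvalues

# Squared singular values are the eigenvalues of A^T A; for symmetric A the
# singular values are therefore the absolute values of the eigenvalues.
assert np.allclose(np.sort(sigma ** 2), np.sort(np.linalg.eigvalsh(A.T @ A)))
assert np.allclose(np.sort(sigma), np.sort(np.abs(lam)))
```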

zorrocai commented 6 years ago

@RoyiAvital Thanks

nikparth commented 6 years ago

Hi @zorrocai I am also attempting to work on the deep-lda generalized eigenvalue problem loss in pytorch so I was wondering if you might be willing to share your code as well with me. My email is nikparth@gmail.com. Thanks!

madarax64 commented 5 years ago

Hello, has anyone had any luck porting the code to PyTorch?

magnum-zx commented 3 years ago

@zorrocai Hello, did you succeed in implementing DeepLDA in PyTorch? The LDA loss in my code doesn't work...