tchaton opened this issue 6 years ago (status: Open)
By the way, where is the gradient calculation of the objective? Or are you relying on Auto Grad?
Yeah, we are relying on autoGrad to do it.
@tchaton, Is it smooth? I'd assume you need a subgradient, since the objective function includes a threshold.
In Theano it works fine. I implemented it in TensorFlow, but TensorFlow doesn't support it properly: the loss was exploding even with the matrix simplification... I was surprised people were able to run autograd through a generalized eigenvalue (GEP) solver at all, ahah.
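For concreteness, here is a minimal PyTorch sketch of the thresholded part of the objective, roughly following the thresholding idea discussed in the paper (the function name, the eps default, and the assumption that the generalized eigenvalues `evals` already carry gradients are illustrative, not from either repo). The hard threshold makes the loss only piecewise smooth, which is why autograd effectively returns a subgradient at the kinks.

```python
import torch

def thresholded_eigval_loss(evals, eps=1.0):
    # Keep only the eigenvalues within eps of the smallest one
    # (the worst-separated directions), as in the thresholded objective.
    thresh = evals.min() + eps
    selected = evals[evals <= thresh]   # boolean indexing keeps the autograd graph
    # Maximize their mean, i.e. minimize the negative mean.
    return -selected.mean()
```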
Did you verify that the autograd results match what is in the appendix of the paper?
I didn't. I assumed it had been tested beforehand.
@tchaton I have written DeepLDA in PyTorch, but it doesn't seem to work well either. Instead of the exploding loss you saw in your TensorFlow version, the problem in the PyTorch version is that all the eigenvalues converge to 1 and the trace-ratio matrix converges to the identity matrix. I found that the cause of this behaviour is the regularisation Sw + lambda*I. But when I remove that line of code, I get a singular-matrix error during training. Do you have any thoughts on this problem?
The lambda*I term is there to keep det(Sw) from becoming 0 so that the inverse stays well defined; training should work better with it. But I have no idea why it doesn't work for you. I reported the error on the TensorFlow tracker and never got an answer from the team, even though it is a very important feature.
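For reference, a minimal PyTorch sketch of a regularized within-class scatter along the lines discussed here (names, the per-class averaging, and the default lambda are illustrative assumptions; the exact normalisation in the repos may differ):

```python
import torch

def within_scatter(features, labels, lam=1e-3):
    # Average of per-class covariance matrices, plus lam*I so the
    # matrix stays invertible even when a batch makes it rank-deficient.
    d = features.shape[1]
    classes = labels.unique()
    sw = torch.zeros(d, d, dtype=features.dtype, device=features.device)
    for c in classes:
        xc = features[labels == c]
        xc = xc - xc.mean(dim=0, keepdim=True)
        sw = sw + xc.t() @ xc / max(xc.shape[0] - 1, 1)
    sw = sw / len(classes)
    return sw + lam * torch.eye(d, dtype=features.dtype, device=features.device)
```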
That said, looking at the paper, this code doesn't solve the correct equation...
Thanks. There are so many hard obstacles in the DeepLDA model...
Can you send me the PyTorch code?
@tchaton Of course, I'm sending the PyTorch code to your email, Thomas.Chaton@uk.fujitsu.com.
Thank you!
Hi,
this is Matthias from the Theano DeepLDA repo. I just wanted to respond to some of the points above.
@tchaton
> evals_t = T.slinalg.eigvalsh(Sb_t, St_t) where it should be evals_t = T.slinalg.eigvalsh(Sb_t, Sw_t) to follow the equation. And in the original code: from scipy.linalg.decomp import eigh; evals, evecs = eigh(Sb, Sw)
Both versions should be fine. Actually, the two formulations should be equivalent. You can find a reference on this in Section 2 of this paper: http://mi.eng.cam.ac.uk/~cipolla/publications/inproceedings/2007-CVPR-Kim-incremental.pdf
I also just checked both variants on MNIST in my Theano implementation and both worked.
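As a quick numerical check of this equivalence (random covariances as stand-ins for the scatter matrices): since St = Sb + Sw, the two generalized problems share eigenvectors and their eigenvalues are related by the monotone map v = mu / (1 - mu).

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
Sw = np.cov(rng.standard_normal((500, 8)).T)   # stand-in for the within scatter
Sb = np.cov(rng.standard_normal((500, 8)).T)   # stand-in for the between scatter
St = Sw + Sb

v = eigh(Sb, Sw, eigvals_only=True)     # Sb e = v  * Sw e
mu = eigh(Sb, St, eigvals_only=True)    # Sb e = mu * St e
print(np.allclose(v, mu / (1.0 - mu)))  # True: same problem up to a monotone map
```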
@RoyiAvital
> Did you verify that the autograd results match what is in the appendix of the paper?
When we did our Theano implementation and experiments, we also relied on autograd. For the paper (appendix) we also wanted to provide a sketch of how to get the gradients of the DeepLDA objective. The actual gradients of the generalized eigenvalues that apply in DeepLDA can be found in Theano's EigvalshGrad operator: https://github.com/Theano/theano/blob/d395439aec5a6ddde8ef5c266fd976412a5c5695/theano/tensor/slinalg.py#L385-L440
This should be in line with what is provided in Section 2.1, Equation (4) (Generalized Eigenvalue Problems) of De Leeuw 2007 (http://gifi.stat.ucla.edu/speed/deleeuw_R_07c.pdf).
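For reference, the differential that this operator (and Equation (4) of De Leeuw 2007) is based on, assuming simple eigenvalues and the normalisation $ e_i^{\top} S_w e_i = 1 $, is:

```latex
\[
  S_b e_i = v_i S_w e_i, \qquad e_i^{\top} S_w e_i = 1
  \quad\Longrightarrow\quad
  dv_i = e_i^{\top}\, dS_b\, e_i \;-\; v_i\, e_i^{\top}\, dS_w\, e_i ,
\]
\[
  \text{i.e.}\qquad
  \frac{\partial v_i}{\partial S_b} = e_i e_i^{\top},
  \qquad
  \frac{\partial v_i}{\partial S_w} = -\,v_i\, e_i e_i^{\top}.
\]
```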
Regarding the gradients (and numerical stability) in PyTorch and TensorFlow, I unfortunately can't comment. As I said, when we first tried the idea with Theano everything worked fine for us from the beginning, so we did not investigate this in detail.
Hello,
Wow, you did a great job. Did you check the part of my code where I use a Cholesky decomposition to simplify the equation? I was also thinking of using the Moore-Penrose pseudo-inverse (https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse) in case the matrix is not invertible for some reason. Do you have any thoughts on those? (A sketch of this reduction follows below the quoted message.)
Best, Tom
2018-05-29 12:41 GMT+00:00 zorrocai notifications@github.com:

@tchaton @dmatte I think the problem that caused the exploding loss in your TensorFlow version, and the poor loss in the PyTorch version, may be the different eigendecomposition methods. There are two basic eigendecomposition routines in both scipy and numpy: eig (eigenvalues and right eigenvectors for non-symmetric arrays) and eigh (eigenvalues and right eigenvectors for symmetric/Hermitian arrays), and they don't all return the same eigenvalues. numpy's eig and eigh applied to s_b/s_w return the same eigenvalues in a different order, but numpy's eigh on s_b/s_w returns different eigenvalues from scipy's eigh(s_b, s_w). Does this mean that $ A \vec{e}_i = v_i B \vec{e}_i $ has different eigenvalues from $ (A / B) \vec{e}_i = v_i \vec{e}_i $ (elementwise division)? What's more, I compared torch.eig with the methods above: torch.eig has the same eigenvalues as numpy's eig. And although s_b/s_w is symmetric, torch.svd returns singular values with the minus signs dropped from some of the eigenvalues. Like this:

```python
>>> import numpy as np
>>> import scipy
>>> import torch
>>> a = np.random.rand(1000, 10)
>>> b = np.random.rand(1000, 10)
>>> s_w = np.cov(a.T)
>>> s_b = np.cov(b.T)
>>> from scipy.linalg.decomp import eigh
>>> evals, evecs = eigh(s_b, s_w)
>>> evals, evecs = scipy.linalg.eigh(s_b, s_w)
>>> evals
array([ 0.82751224,  0.8575706 ,  0.90092649,  0.93457503,  0.96302211,
        1.00898617,  1.02661205,  1.10996722,  1.14103357,  1.26663491])
>>> evals, evecs = scipy.linalg.eig(s_b, s_w)
>>> evals
array([ 1.26663491+0.j,  1.14103357+0.j,  1.10996722+0.j,  0.82751224+0.j,
        0.85757060+0.j,  0.90092649+0.j,  0.93457503+0.j,  0.96302211+0.j,
        1.02661205+0.j,  1.00898617+0.j])
>>> evals, evecs = np.linalg.eigh(s_b/s_w)
>>> evals
array([-14.82691125,  -9.93629821,  -3.0339017 ,  -0.99347968,  -0.23726914,
         1.46624221,   2.76740404,   4.01231621,  12.13940769,  18.58078932])
>>> evals, evecs = np.linalg.eig(s_b/s_w)
>>> evals
array([ 18.58078932, -14.82691125,  12.13940769,  -9.93629821,  -3.0339017 ,
         4.01231621,   2.76740404,  -0.99347968,  -0.23726914,   1.46624221])
>>> s_w = torch.from_numpy(s_w)
>>> s_b = torch.from_numpy(s_b)
>>> evals, evecs = torch.eig(s_b/s_w)
>>> evals
tensor([[ 18.5808,   0.0000],
        [-14.8269,   0.0000],
        [ 12.1394,   0.0000],
        [ -9.9363,   0.0000],
        [ -3.0339,   0.0000],
        [  4.0123,   0.0000],
        [  2.7674,   0.0000],
        [ -0.9935,   0.0000],
        [ -0.2373,   0.0000],
        [  1.4662,   0.0000]], dtype=torch.float64)
>>> u, s, v = torch.svd(s_b/s_w)
>>> s
tensor([ 18.5808,  14.8269,  12.1394,   9.9363,   4.0123,   3.0339,
          2.7674,   1.4662,   0.9935,   0.2373], dtype=torch.float64)
>>> torch.equal((s_b/s_w).t(), s_b/s_w)
True
```
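On the Cholesky / pseudo-inverse question above, a minimal NumPy sketch (illustrative code, not taken from either repo): with Sw + lam*I = L L^T, the generalized problem Sb e = v Sw e becomes the ordinary symmetric problem (L^-1 Sb L^-T) y = v y with e = L^-T y, and the Moore-Penrose pseudo-inverse only enters as a fallback when the Cholesky factorisation fails.

```python
import numpy as np

def generalized_eigvals(Sb, Sw, lam=1e-4):
    Sw_reg = Sw + lam * np.eye(Sw.shape[0])
    try:
        L = np.linalg.cholesky(Sw_reg)        # Sw_reg = L @ L.T
        Linv = np.linalg.inv(L)
        M = Linv @ Sb @ Linv.T                # symmetric reduced problem
        return np.linalg.eigvalsh(M)
    except np.linalg.LinAlgError:
        # Fallback: pseudo-inverse; the product is no longer exactly
        # symmetric, so use the general eigensolver and sort.
        M = np.linalg.pinv(Sw_reg) @ Sb
        return np.sort(np.linalg.eigvals(M).real)
```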
Hello,
I have found this very interesting code: https://github.com/roatienza/Deep-Learning-Experiments/blob/master/Experiments/Tensorflow/Math/decomposition.py
Make sure to look at it :)
Best, Tom
Hi @tchaton, yeah, for a symmetric matrix the eigenvalues are the same as the singular values (up to sign); you can see the details in the Matrix Cookbook: https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf. And I found that I made a mistake in the equation: $ S_b \vec{e}_i = v_i S_w \vec{e}_i $ should be the same as $ S_w^{-1} S_b \vec{e}_i = v_i \vec{e}_i $, NOT the elementwise ratio $ (S_b / S_w) \vec{e}_i = v_i \vec{e}_i $, because $ S_b / S_w \neq S_w^{-1} S_b $. What is more, although Sw (the within-class scatter matrix) and Sb (the between-class scatter matrix) are both symmetric, $ S_w^{-1} S_b $ is not symmetric.
Best, Zorro
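A quick numerical check of that point (random covariances again as stand-ins): the eigenvalues of the matrix $ S_w^{-1} S_b $ reproduce the generalized eigenvalues, while the elementwise ratio $ S_b / S_w $ gives something unrelated.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
Sw = np.cov(rng.standard_normal((1000, 10)).T)
Sb = np.cov(rng.standard_normal((1000, 10)).T)

ref = eigh(Sb, Sw, eigvals_only=True)                             # Sb e = v Sw e
inv_form = np.sort(np.linalg.eigvals(np.linalg.solve(Sw, Sb)).real)
elemwise = np.sort(np.linalg.eigvals(Sb / Sw).real)               # elementwise ratio

print(np.allclose(ref, inv_form))   # True:  inv(Sw) @ Sb is equivalent
print(np.allclose(ref, elemwise))   # False: Sb / Sw is a different matrix
```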
@zorrocai ,
Singular values are the square roots of the eigenvalues of $ {A}^{\top} A $ or $ A {A}^{\top} $ (which share the same non-zero eigenvalues). Hence the connection between the two is $ \sigma {\left( A \right)}_{i}^{2} = \lambda {\left( {A}^{\top} A \right)}_{i} $; for a symmetric $ A $ this reduces to $ \sigma {\left( A \right)}_{i} = \lvert \lambda {\left( A \right)}_{i} \rvert $.
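A one-line sanity check of the symmetric case, $ \sigma_i = \lvert \lambda_i \rvert $, which is exactly the sign-stripping torch.svd showed above:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A = (A + A.T) / 2                                            # symmetric test matrix

evals = np.linalg.eigvalsh(A)
svals = np.linalg.svd(A, compute_uv=False)
print(np.allclose(np.sort(np.abs(evals)), np.sort(svals)))  # True
```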
@RoyiAvital Thanks
Hi @zorrocai, I am also attempting to implement the DeepLDA generalized eigenvalue problem loss in PyTorch, so I was wondering if you might be willing to share your code with me as well. My email is nikparth@gmail.com. Thanks!
Hello, Has anyone had any luck with porting the code to PyTorch?
@zorrocai Hello, did you succeed in implementing DeepLDA in PyTorch? The LDA loss in my code doesn't work...
Hello,
I am converting this code to TensorFlow, even though it is a bit complicated. While studying the code, I found that you are solving evals_t = T.slinalg.eigvalsh(Sb_t, St_t), where it should be evals_t = T.slinalg.eigvalsh(Sb_t, Sw_t) to follow the equation; the original code also uses from scipy.linalg.decomp import eigh; evals, evecs = eigh(Sb, Sw). My email address is Thomas.Chaton@uk.fujitsu.com if you want to talk about it.