hroest closed this issue 10 years ago
mac osx shows the same problem with the GammaDistributionFitter_test (eigen 3.2.0 contrib vs eigen 3.2.1 homebrew)
After some debugging it seems that some changes between eigen 3.2.0 and 3.2.1 have affected parts of the LM used in the GammaDistributionFitter without actually touching it. The weird thing is that the GammaDistributionFitter actually produces a lot of NaNs while fitting the test data. So I'm not sure if what we are trying to do is even valid. I will try to get some more details ..
Did we consider that our test data may be unsuitable? Maybe also check our test data and whether it really is testing what we think it is. I don't really understand why all other tests involving the fitting work except the unit test. Specifically, what happens when a larger dataset is used?
3.2.1 was supposed to be bugfix only, right?
Hannes
test data isn't optimal but should work. the problem is that the gamma distribution is only defined for positive parameters, but the optimisation temporarily steps to negative values, which leaves the fit in a more or less undefined state
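A minimal sketch of why negative parameters are a problem, assuming the density is written in rate/shape form f(x) = b^p / Γ(p) · x^(p-1) · e^(-b·x). The function name `gamma_pdf_naive` is hypothetical and is not the OpenMS implementation; it only shows that a direct evaluation turns into NaN once the optimizer steps to a negative `b`.

```cpp
#include <cmath>

// Hypothetical helper (not OpenMS code): gamma density in rate/shape form,
//   f(x) = b^p / Gamma(p) * x^(p-1) * exp(-b*x)
// evaluated directly, without any check on the parameters.
double gamma_pdf_naive(double x, double b, double p)
{
  // For b < 0 and a non-integer p, std::pow(b, p) has no real result
  // and returns NaN, which then propagates through the whole product.
  return std::pow(b, p) / std::tgamma(p) * std::pow(x, p - 1.0) * std::exp(-b * x);
}
```

With valid parameters, e.g. `gamma_pdf_naive(1.0, 1.0, 2.5)`, the result is finite. With `b = -1.0` and the same `p`, the result is NaN, and any residual or gradient built on it would be NaN as well.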
@aiche: why are there so many points at zero? the fit does not seem to be perfect, is it due to these extra values?
why are there so many points at zero?
don't know
the fit does not seem to be perfect, is it due to these extra values?
possible, given the zero values it could still be the best (in the least-squares sense) fit
can we prevent negative values in the optimization step, e.g. during evaluation of the function or computation of the gradients ?
not really, we could only abort the optimization if the values get negative. the LM implementation in eigen (and also the one in gsl) doesn't support constraints on the parameters
does the fit work (at least not fail) if you delete all the points at zero?
removing all the zeros still gives negative parameters .. so this won't work. Changing the initial parameters does avoid going into the negative values, but this only fixes the test and not the underlying problem
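One common workaround when the optimizer itself has no parameter constraints is to optimize over the logarithms of the parameters. This is not something proposed in this thread, just a sketch under stated assumptions: the density uses the rate/shape form f(x) = b^p / Γ(p) · x^(p-1) · e^(-b·x), all data points have x > 0, and the names `gamma_pdf_positive` and `residuals_unconstrained` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: gamma density for b > 0, p > 0 and x > 0,
// evaluated in log space to avoid overflow in b^p and Gamma(p).
double gamma_pdf_positive(double x, double b, double p)
{
  return std::exp(p * std::log(b) - std::lgamma(p) + (p - 1.0) * std::log(x) - b * x);
}

// Residuals as a function of the unconstrained variables u = log(b), v = log(p).
// Whatever step the optimizer takes in (u, v), b and p stay strictly positive.
std::vector<double> residuals_unconstrained(double u, double v,
                                            const std::vector<double>& xs,
                                            const std::vector<double>& ys)
{
  const double b = std::exp(u);
  const double p = std::exp(v);
  std::vector<double> r(xs.size());
  for (std::size_t i = 0; i < xs.size(); ++i)
  {
    r[i] = gamma_pdf_positive(xs[i], b, p) - ys[i];
  }
  return r;
}
```

The fitted values are recovered as `b = exp(u)` and `p = exp(v)`. If analytic derivatives are used, they pick up a chain-rule factor (d/du = b · d/db, d/dv = p · d/dp). Points at x = 0 would need separate handling, since `std::log(0)` is not finite.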
When trying out the debian system-library Eigen3 (libeigen3-dev 3.2.1-2) I get the following test failure (I assume that this is related to eigen since the fitter uses eigen): 203 - GammaDistributionFitter_test (Failed)
the difference is quite substantial
therefore I was wondering which version of Eigen exactly we are using. It seems from here http://sourceforge.net/projects/open-ms/files/contrib/ that we are using 3.2.0 and debian is using 3.2.1. In addition there is a very interesting bugfix-patch that debian applies to Eigen which is also not present in our release (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=736985)
Next I tried a fresh download from http://eigen.tuxfamily.org/index.php?title=News:Eigen_3.2.1_released! and used all of our own contrib libraries with the same result as the debian system library. So it does not seem to be the above debian fix but rather a difference between Eigen 3.2.1 and Eigen 3.2.0
Also I wonder why none of the tests for the decoy distribution failed since they seem to use the Gamma distribution fitter internally to fit the false distribution
It seems that the expected result of the fit was added to the test in 2009 (2d77b9b5e5afdeb86ce47ba69dcd804968a8b895) and has not been changed since. However, we have no way to verify which result is correct, except that the previous Eigen and GSL agreed and the new Eigen 3.2.1 suddenly does not agree with them any more.
Can somebody
GammaDistributionFitter_test