gundam-organization / gundam

GUNDAM, for Generalized and Unified Neutrino Data Analysis Methods, is a suite of applications which aims to perform various statistical analyses with different purposes and setups.
GNU Lesser General Public License v2.1

NaN value issue on LTS/1.8.x #540

Closed LenaOsu closed 2 weeks ago

LenaOsu commented 3 weeks ago

When using external mirroring for the cross-section splines (eigen decomposition for Flux, xsec FSI and ND cov. matrix, no scan option), the data fit still crashes with a NaN error. I had already commented out L214 (the whole paragraph) in JointProbability.cpp and L91 in Parameter.cpp, because they lead to NaN issues too, but this check still fails: LogThrowIf( std::isnan(parameterValue), "Attempting to set NaN value for par:" << std::endl << this->getSummary() The this->getSummary() call does not print anything; the crash happens before it. This happens during the MIGRAD process, so the fit cannot reach Hesse.

ClarkMcGrew commented 3 weeks ago

The throw is "correct" since the parameter value should not be a NaN, so the question is "why is it a NaN?" Are the input files you're using already committed in GundamOA2024? And what is the command line you're running to prompt the error (i.e. the overrides, options, etc.)? I've got some chores to finish since I'm doing a system upgrade, but I'll try to take a look.

ClarkMcGrew commented 3 weeks ago

Did I understand correctly that you needed to comment out Parameter.cpp L91 to stop a crash? That would mean that Parameter::setParameter is being called with a NaN, which is a straight-out bug.

Similarly, JointProbability.cpp L214 should probably be preceded by a LogThrowIf(std::isnan(chiSq)), since a NaN should never occur for correct inputs. An infinite chisq can be valid, but during a Minuit fit it tends to indicate there is a problem with the inputs.

Both of those make it sound like there is a problem with the inputs. Have you found anything new?
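As a rough illustration of the two guards discussed above, here is a minimal standalone sketch. It is not GUNDAM code: ToyParameter, setValue and checkedChiSq are hypothetical names, and plain exceptions stand in for the LogThrowIf macro.

```cpp
#include <cmath>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a fit parameter; not GUNDAM's Parameter class.
struct ToyParameter {
  std::string name;
  double value;

  explicit ToyParameter(std::string n) : name(std::move(n)), value(0.0) {}

  // Refuse NaN at the point of assignment, so the failure is reported
  // where the bad value first shows up. The stored value is left untouched.
  void setValue(double newValue) {
    if (std::isnan(newValue)) {
      std::ostringstream msg;
      msg << "Attempting to set NaN value for par: " << name;
      throw std::invalid_argument(msg.str());
    }
    value = newValue;
  }
};

// Guard to run just before a chi-square is used: NaN is never valid,
// while an infinite value is passed through unchanged.
inline double checkedChiSq(double chiSq) {
  if (std::isnan(chiSq)) {
    throw std::runtime_error("chiSq is NaN: check the inputs");
  }
  return chiSq;
}
```

Commenting out a guard like this only moves the failure further downstream, which matches the behaviour reported in this thread.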

LenaOsu commented 3 weeks ago

Hi Clark,

Yes, you understood correctly! It seems that the only parameter that changes a lot between the GUNDAM internal mirroring and the mirroring made from OAGenWeightsApps is R_S_Delta_Decay. I'll run a fit without enabling it and see whether it crashes. I did not get this issue when I used the GUNDAM internal mirroring. I will also check the splines by event in the new data input files. But yes, even if I mute L214 in JointProbability.cpp and L91 in Parameter.cpp, I still get a NaN value error that crashes the fit before it reaches Hesse. Thanks a lot Clark, I will try to clarify the potential input issue. Léna



LenaOsu commented 3 weeks ago

I meant the MC input files. Léna



ClarkMcGrew commented 3 weeks ago

The NaN throw at Parameter.cpp L91 already says things are hopeless. Parameter::setParameter should never be called with a NaN, so removing the check won't help; it is probably catching the first place the problem is seen. My guess is that with external mirroring the fit is moving past the end of the mirrored spline and we are getting a NaN.
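To picture how stepping past the end of a tabulated response could produce a NaN, here is a toy sketch. It assumes a piecewise-linear table that returns NaN outside its knots; ToySpline, contains and eval are hypothetical names, and this says nothing about how GUNDAM's spline code actually behaves out of range.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Toy piecewise-linear response, defined only on [x.front(), x.back()].
// Hypothetical illustration; not GUNDAM's spline implementation.
struct ToySpline {
  std::vector<double> x;  // strictly increasing knot positions
  std::vector<double> y;  // response at each knot

  // True only when v lies inside the tabulated range (false for NaN input).
  bool contains(double v) const {
    return x.size() >= 2 && y.size() == x.size() &&
           v >= x.front() && v <= x.back();
  }

  // Returns NaN outside the tabulated range instead of extrapolating.
  double eval(double v) const {
    if (!contains(v)) return std::numeric_limits<double>::quiet_NaN();
    std::size_t i = 1;
    while (i + 1 < x.size() && v > x[i]) ++i;  // find the segment [x[i-1], x[i]]
    const double t = (v - x[i - 1]) / (x[i] - x[i - 1]);
    return y[i - 1] + t * (y[i] - y[i - 1]);
  }
};
```

In this toy, a minimizer step that lands outside the knots yields a NaN weight, which would then propagate into the chi-square and into the next parameter update. Checking contains() before evaluating is one way to find out which parameter left its tabulated range.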

The JointProbability code should distinguish between NaN (always a bug) and infinity (a possibly valid response), but the isfinite check traps both. I'll try to patch it so that NaN is a throw and "not isfinite" is a warning.
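The distinction described above could look roughly like the sketch below. This is an assumption about the shape of such a check, not the actual patch: ChiSqStatus and classifyChiSq are hypothetical names, and std::cerr stands in for the project's logger.

```cpp
#include <cmath>
#include <iostream>
#include <stdexcept>

// Outcome of inspecting a chi-square value (hypothetical helper type).
enum class ChiSqStatus { Finite, Infinite };

// NaN is treated as a bug and throws; +/-infinity only triggers a warning,
// since an infinite chi-square can be a valid (if suspicious) result.
inline ChiSqStatus classifyChiSq(double chiSq) {
  if (std::isnan(chiSq)) {
    throw std::runtime_error("chiSq is NaN: this should never happen");
  }
  if (!std::isfinite(chiSq)) {
    std::cerr << "Warning: chiSq is infinite, check the inputs" << std::endl;
    return ChiSqStatus::Infinite;
  }
  return ChiSqStatus::Finite;
}
```

The NaN test has to come first: std::isfinite returns false for both NaN and infinity, so on its own it cannot tell the two cases apart.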

ClarkMcGrew commented 2 weeks ago

Closed by #543