Closed Zeff020 closed 1 year ago
I briefly tried looking into this. I put some prints in FlexibleInterpVar::evaluate
just to see the values. Without having any idea what FlexibleInterpVar
should be doing because documentation is lacking, the first thing that strikes me as odd is that, in the evaluation that gives a NaN, the "low" value is higher than the "high" value. Is this expected?
I also tried a partial "revert" (manually) of earlier commit https://github.com/root-project/root/commit/466f3f689c578cb53d75ddeeb04472ec4d82e3ed which changed the 6th order interpolation scheme (which is used in this failing example), uncommenting the commented out code here (and commenting the uncommented code):
inline double interpolate6thDegree(double x, double low, double high, double nominal, double boundary)
{
double t = x / boundary;
double eps_plus = high - nominal;
double eps_minus = nominal - low;
double S = 0.5 * (eps_plus + eps_minus);
double A = 0.0625 * (eps_plus - eps_minus);
return x * (S + t * A * (15 + t * t * (-10 + t * t * 3)));
// double pow_up = std::pow(high / nominal, boundary);
// double pow_down = std::pow(low / nominal, boundary);
// double logHi = std::log(high);
// double logLo = std::log(low);
// double pow_up_log = high <= 0.0 ? 0.0 : pow_up * logHi;
// double pow_down_log = low <= 0.0 ? 0.0 : -pow_down * logLo;
// double pow_up_log2 = high <= 0.0 ? 0.0 : pow_up_log * logHi;
// double pow_down_log2 = low <= 0.0 ? 0.0 : -pow_down_log * logLo;
//
// double S0 = (pow_up + pow_down) / 2;
// double A0 = (pow_up - pow_down) / 2;
// double S1 = (pow_up_log + pow_down_log) / 2;
// double A1 = (pow_up_log - pow_down_log) / 2;
// double S2 = (pow_up_log2 + pow_down_log2) / 2;
// double A2 = (pow_up_log2 - pow_down_log2) / 2;
//
// // cache coefficient of the polynomial
// double p0 = 1. / (8 * boundary) * (15 * A0 - 7 * boundary * S1 + boundary * boundary * A2);
// double p1 = 1. / (8 * boundary * boundary) * (-24 + 24 * S0 - 9 * boundary * A1 + boundary * boundary * S2);
// double p2 = 1. / (4 * std::pow(boundary, 3)) * (-5 * A0 + 5 * boundary * S1 - boundary * boundary * A2);
// double p3 = 1. / (4 * std::pow(boundary, 4)) * (12 - 12 * S0 + 7 * boundary * A1 - boundary * boundary * S2);
// double p4 = 1. / (8 * std::pow(boundary, 5)) * (+3 * A0 - 3 * boundary * S1 + boundary * boundary * A2);
// double p5 = 1. / (8 * std::pow(boundary, 6)) * (-8 + 8 * S0 - 5 * boundary * A1 + boundary * boundary * S2);
//
// return 1. + x * (p0 + x * (p1 + x * (p2 + x * (p3 + x * (p4 + x * p5)))));
}
This still gives a NaN for this workspace, though, so this doesn't seem to be the issue.
Oops, the above wasn't enough for the "revert", I also had to replace the calling site (in flexibleInterp
) with
// return total * std::exp(interpolate6thDegree(x, std::log(low), std::log(high), std::log(nominal), boundary));
return total * interpolate6thDegree(x, low, high, nominal, boundary);
This (together with using the old code as commented out above) indeed gets rid of the nan
!
I think the crucial part is that the old code checks for high <= 0
and also low <= 0
.
Indeed, I traced the nan
there: the sequence of events in this failing example is that high
is zero, so that std::log(high)
is -inf
. Then eps_plus
is also -inf
and so S
and A
too, so you -inf
times something negative (t
happens to be negative in this example) so the term becomes +inf
and eventually you get a -inf + inf
and that equals nan
.
So, there's at least two options to fix this, both will need some conditionals added in again.
<= 0
checks to the new algorithm. I think this should take place at the call site, because it doesn't make sense for the other interpolate6thDegree
usecase where no weird values are expected from logarithms.By the way, why is the interpolation taking place in logarithmic space instead of regular linear space and then transformed back into linear space after interpolation? Doesn't this also mean that the two interpolation schemes (FlexibleInterpVar and PiecewiseInterpolation) are again inconsistent? If the logarithm can be eliminated, I think we can do without introducing conditionals, no? That is, assuming the high and low vectors are filled sensibly, which is not checked anywhere as far as I can see, so that's still a minor weakness, but I guess that could easily be added in the documentation.
I tried out option 3 and it gives pretty good results, using your comparison method as documented in your commit message here https://github.com/root-project/root/commit/466f3f689c578cb53d75ddeeb04472ec4d82e3ed:
value_old value_new diff_abs diff_rel
param
-1.00 0.9000000000 0.9000000000 0.0000000000 0.00E+00
-0.95 0.9047424331 0.9049857073 0.0002432742 2.69E-04
-0.85 0.9140836603 0.9146805378 0.0005968775 6.53E-04
-0.75 0.9230659064 0.9237960815 0.0007301751 7.91E-04
-0.65 0.9317147727 0.9323668386 0.0006520658 7.00E-04
-0.55 0.9402402768 0.9406592464 0.0004189696 4.46E-04
-0.45 0.9489635592 0.9490771800 0.0001136208 1.20E-04
-0.35 0.9582539443 0.9580809519 -0.0001729925 -1.81E-04
-0.25 0.9684763536 0.9681198120 -0.0003565416 -3.68E-04
-0.15 0.9799490728 0.9795779479 -0.0003711248 -3.79E-04
-0.05 0.9929118716 0.9927339847 -0.0001778869 -1.79E-04
0.05 1.0075044778 1.0077339847 0.0002295068 2.28E-04
0.15 1.0237554038 1.0245779479 0.0008225442 8.03E-04
0.25 1.0415811263 1.0431198120 0.0015386857 1.48E-03
0.35 1.0607956204 1.0630809519 0.0022853314 2.15E-03
0.45 1.0811302458 1.0840771800 0.0029469342 2.73E-03
0.55 1.1022639865 1.1056592464 0.0033952598 3.08E-03
0.65 1.1238640449 1.1273668386 0.0035027937 3.12E-03
0.75 1.1456367871 1.1487960815 0.0031592944 2.76E-03
0.85 1.1673890440 1.1696805378 0.0022914938 1.96E-03
0.95 1.1890997634 1.1899857073 0.0008859439 7.45E-04
1.00 1.2000000000 1.2000000000 0.0000000000 0.00E+00
The difference is slightly worse, half an order of magnitude I'd guesstimate, but perhaps this is not such a big deal.
The implementation that produced these "new" results:
interpolate6thDegree
.flexibleInterp
, I removed the log
s, the exp
and I added a 1 +
, which is also done in the old algorithm before returning. Indeed, I saw that when I did not add it, the absolute difference of the first results was exactly -1
. So, in the end the calling line becomes return total * (1 + interpolate6thDegree(x, low, high, nominal, boundary));
.No conditionals needed and the results match the old ones pretty well still.
Thanks a lot for the investigation! Option 3 in your recent comment sounds like the ideas solution to me! And it indeed fixes your problem with the ATLAS workspaces? Can you make a PR with that? I think it's the good solution without reverting https://github.com/root-project/root/pull/13067.
Yes, I'll make a PR 👍
In the end, this problem is fixed by the following PR:
The PR restores the old behavior for now. All these striking inconsistencies will be fixed in a follow-up PR.
Check duplicate issues.
Description
Since recently we have been seeing that on ROOT master a simple workspace with a single
RooStats::HistFactory::FlexibleInterpVar
component fails already just on evaluation, whereas it works in ROOT 6.28.04. The workspace is attached, and the code to reproduce the error included in the reproducer.Reproducer
root -l failWS_release.root
fail->Print()
Please find here the workspace: failWS_release.zip
ROOT version
Current ROOT master gives the error. ROOT 6.28.04 does not.
Installation method
build from source
Operating system
Linux centos7
Additional context
The PR that introduced the problem: https://github.com/root-project/root/pull/13067