Closed valassi closed 2 months ago
I have finally also fixed the #831 FPEs for a variety of use cases. I am now rerunning large-scale tests; after that, this should also be ready to go.
Note, the LHE file mismatch #833 in heft_gg_bb will normally not be fixed in this MR.
I had forgotten to backport the very last fix for #831, now it's done.
@oliviermattelaer I assign this to you as reviewer too, can you please have a look?
In the meantime I will run more tests, but this time hopefully the code should no longer change. Thanks!
Merged here 'upstream/master' (with SUSY #824 and SMEFT #632)
Hi @oliviermattelaer thanks a lot for the discussion and review.
I made various changes according to what we discussed. Essentially the "new strategy for FPEs #831" is the following:
Enable SIGFPE by default for INVALID, DIVBYZERO and OVERFLOW via feenableexcept. These are the FPEs that I have extensively fixed in the ixx/oxx functions over the last few months anyway, so I hope there are none left; and if there are, we'd better fix them. I disabled (commented out) the env variable CUDACPP_RUNTIME_ENABLEFPE: these three SIGFPEs are now enabled by default.
About underflow (#831), which motivated many of the changes in this PR: in the end I decided that we should not treat it as a problem, and we just let it happen. So I removed it from the feenableexcept call (it used to have four FPEs, now it only has three). See also https://stackoverflow.com/questions/44308577/ieee-underflow-flag-ieee-denormal-in-fortran-77 and https://gcc.gnu.org/onlinedocs/gfortran/Debugging-Options.html.
By the way, about underflows: I did some tests (committed in tools/fpe). Given FLT_MIN = 1.17549e-38, computing FLT_MIN / 10 does signal an underflow, but it prints out 1.17549e-39, not zero. I think (I may be wrong) that these numbers are "denormals", with less precision than a normal float, but still more or less usable. So this is one more reason not to just flush to zero as I was doing.
For all four FPEs above, I added a printout at the end saying whether or not each has been reported. Normally this will only report underflows (because the other three will instead crash by design with SIGFPE). Indeed, in heft_gg_bb tests with float/mixed precision I did see underflows.
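For reference, here is a minimal standalone sketch of the strategy described above; it is not the actual cudacpp code. The helper names (enableFpeTraps, tenthOfFltMin, fpeReport) are hypothetical. feenableexcept is a glibc extension, so the sketch guards it and does nothing on other platforms. It traps only INVALID, DIVBYZERO and OVERFLOW, lets underflow happen, and reports the sticky flags for all four at the end.

```cpp
#include <cassert>
#include <cfenv>
#include <cfloat>
#include <string>

// Hypothetical helper: enable SIGFPE traps for INVALID, DIVBYZERO and OVERFLOW only
// (UNDERFLOW is deliberately left out). feenableexcept is a glibc extension,
// so on other platforms this sketch is a no-op and returns false.
inline bool enableFpeTraps()
{
#if defined( __GLIBC__ )
  return feenableexcept( FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW ) != -1;
#else
  return false;
#endif
}

// Compute FLT_MIN / 10 at run time; volatile prevents compile-time constant folding.
// With default floating point settings the result is a denormal, not zero.
inline float tenthOfFltMin()
{
  volatile float tiny = FLT_MIN;
  volatile float result = tiny / 10;
  assert( result < FLT_MIN ); // sanity check: below the smallest normal float
  return result;
}

// Hypothetical end-of-job report: which of the four sticky FPE flags are set.
inline std::string fpeReport()
{
  auto flag = []( int f ) { return std::fetestexcept( f ) ? "yes" : "no"; };
  std::string out;
  out += std::string( "INVALID=" ) + flag( FE_INVALID );
  out += std::string( " DIVBYZERO=" ) + flag( FE_DIVBYZERO );
  out += std::string( " OVERFLOW=" ) + flag( FE_OVERFLOW );
  out += std::string( " UNDERFLOW=" ) + flag( FE_UNDERFLOW );
  return out;
}
```

A typical use would be to call enableFpeTraps() once at startup and print fpeReport() just before exiting. With traps enabled, the first three flags would normally never show up in the report, because the program would have been killed by SIGFPE first.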
I am now doing a few more tests and cleanup, but I think this should be ok. I will ask you to review again when I am done. Thanks
Hi @oliviermattelaer I confirm my changes are in. I re-request the review then (it should be easier now, a lot of code has disappeared!) Thanks
PS The CI succeeds. I am running a few more tests outside as usual anyway
Hi @oliviermattelaer I confirm I have finished. This is good to merge for me after your review. Thanks
For me this is good to go, maybe one detail would be to remove the special case for 'mdl_Gexp2' and similar but that's ok if you want to keep it.
Hi Olivier, thanks a lot.
I had a look at the 'mdl_Gexp2' and 'mdl_Gexp3' issue. It seems that your patch fixed 'mdl_Gexp3' but NOT 'mdl_Gexp2'. I will add a comment in https://github.com/mg5amcnlo/mg5amcnlo/issues/89.
So anyway, here I have removed my cudacpp patch for G^3 but kept it for G^2. I regenerated all code and am relaunching the CI. I will merge when the CI is ok.
I will reopen https://github.com/mg5amcnlo/mg5amcnlo/issues/89 so that you can have a look. If/when you create a new patch I will test it in a separate madgraph4gpu PR.
Thanks again Andrea
There are some issues on Mac (see #838); I am trying to debug them and work around them.
My workaround for #838 was ok on ee_mumu. I am applying it on CODEGEN and all processes and relaunching the CI
Ok the MacOS tests succeed after disabling testmisc (workaround for #838 - real fixes will be needed in #839 and #840).
All CI tests succeed. I am merging this. Thanks again @oliviermattelaer for the review and the approval.
(I still need to add a comment on https://github.com/mg5amcnlo/mg5amcnlo/issues/89, I will do that now. I got distracted by the MacOS issues).
This is a WIP PR that logically follows the SUSY and SMEFT PRs. In particular it includes #632 which includes #824.
The main change is that I added heft_gg_bb to the repo (and am removing heft_gg_h).
Concerning heft_gg_bb, this now includes 4 diagrams (3 SM and also 1 HEFT) after fixing #828. This essentially means adding one diagram even if it is suppressed.
Now, the fact that the contribution is suppressed is probably why I started getting FPE underflows (#831). I have now also fixed this, essentially by flushing to 0 all values of jamp_sv below 1E-18, so that their square is still above 1E-36 and fits in a float.
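To illustrate the flush-to-zero idea in this description, here is a rough sketch under stated assumptions; it is not the code in this PR. It assumes that the amplitudes are a plain std::vector of std::complex<float> (the real jamp_sv is a SIMD vector type) and that the threshold is applied separately to the real and imaginary parts. The names flushToZero and flushJamps are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Threshold from the description: values below 1e-18 are flushed,
// so that the square of any surviving value stays above 1e-36.
constexpr float kFlushThreshold = 1e-18f;

// Hypothetical helper: return 0 if |x| is below the threshold, else x unchanged.
inline float flushToZero( float x, float threshold = kFlushThreshold )
{
  assert( threshold > 0 );
  return std::fabs( x ) < threshold ? 0.0f : x;
}

// Hypothetical helper: flush the real and imaginary parts of each amplitude
// (assumption: the threshold is applied per component, not to the modulus).
inline void flushJamps( std::vector<std::complex<float>>& jamps )
{
  for( auto& j : jamps )
    j = std::complex<float>( flushToZero( j.real() ), flushToZero( j.imag() ) );
}
```

After flushJamps, squaring any non-zero component gives a value of at least about 1e-36, which is still above FLT_MIN (about 1.18e-38), so computing the squared amplitude in single precision no longer underflows.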
This is still WIP as I am running a few more tests.