Open mathomp4 opened 4 months ago
Confirmed that 27d47d4a3b4b9e2f426186a02a74cc4649432f43 works but 4486933fe6e7f0fcf7a122da7d65e11b650d87f0 does not. So something between causes it:
but that is (essentially) #2838 via #2888. And I'm not using any of the @metdyn code! Aaaa!
I'm adding @tclune to this because I am confused.
Indeed, as @atrayano saw, I can run this code with History OFF and it fails. And yet all the changes in 4486933fe6e7f0fcf7a122da7d65e11b650d87f0 were in History! Aaaaa!
Note: if you turn off ExtData, it does run. So it seems like ExtData is the issue...but then this:
has no changes!
New update! If I build MAPL3 GEOSgcm with GNU and use my Aggressive flags, it works! This is really looking like one of those "memory got mooshed around" sort of things (like GNU + MOM6 which randomly works then fails then works...)
Well, I tried GNU with Release changed to use -O2 instead of -O3, but that still fails. So huh.
The regular release flags are (excluding flags common to Release and Aggressive):
Fortran_FLAGS = -O3 -march=znver2 -mtune=generic -funroll-loops -ffpe-trap=zero,overflow
and the aggressive are:
Fortran_FLAGS = -O2 -march=native -ffast-math -ftree-vectorize -funroll-loops --param max-unroll-times=4 -mno-fma
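For reference, here is a minimal Python sketch that diffs the two flag sets above, so it is easy to see which flags could be toggled one at a time. The flag strings are copied from this comment; the helper names (`tokenize`, `diff_flags`) are made up for illustration and are not part of MAPL or any build system.

```python
import shlex

# Flag strings copied from the comment above (flags common to both
# build types were already excluded there, apart from -funroll-loops).
RELEASE = "-O3 -march=znver2 -mtune=generic -funroll-loops -ffpe-trap=zero,overflow"
AGGRESSIVE = (
    "-O2 -march=native -ffast-math -ftree-vectorize -funroll-loops "
    "--param max-unroll-times=4 -mno-fma"
)


def tokenize(flags):
    """Split a flag string, keeping '--param name=value' as one item."""
    tokens = shlex.split(flags)
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "--param" and i + 1 < len(tokens):
            out.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def diff_flags(a, b):
    """Return (only in a, only in b, in both), preserving original order."""
    a_tokens, b_tokens = tokenize(a), tokenize(b)
    only_a = [f for f in a_tokens if f not in b_tokens]
    only_b = [f for f in b_tokens if f not in a_tokens]
    common = [f for f in a_tokens if f in b_tokens]
    return only_a, only_b, common


only_release, only_aggressive, common = diff_flags(RELEASE, AGGRESSIVE)
print("Only in Release:   ", only_release)
print("Only in Aggressive:", only_aggressive)
print("In both:           ", common)
```

Each entry in the "only" lists is a candidate for a one-flag-at-a-time experiment, which is the same idea as the -ffpe-trap test.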
So, I guess maybe I'll try a run without -ffpe-trap?
ETA: Didn't help. 😞
I tried running ExtDataDriver.x from the model build with release/MAPL-v3 in my "simulate gocart" mode, i.e., run with the same inputs to ExtData that the real model uses. It ran fine, so this definitely seems like a "memory got smooshed" issue.
Dang. I might need to just fiddle with the flags in various places. I guess the ExtData gridcomp is the place to start.
My nightly tests have shown that GEOSgcm running MAPL3 with GNU and Release crashes at the end of ExtData:
I looked back and this was working as of June 23, failing on June 24.
Not much has gone into MAPL3 since then, mainly stuff from @metdyn ... but I'm not exercising that!