Closed by yichengt900 5 months ago
Commit 768da600 made some parameter declarations too long to fit on a single line; breaking up those lines could have introduced an error.
PR #59 also changed the default of do_case2_mod from true to false. It should be true to reproduce the previous NWA12 solutions.
Can you point me to the output files? Looking at which variables diverge first may give a hint as to which parameter has the error.
@theresa-morrison, thanks. I noticed the do_case2_mod change, and it has been added to all the NWA12 tests I have conducted so far. I was able to get b2b answers with your early commits. I suspect it might be something to do with the scale option provided by get_param.
You can find the testing folders here:
/gpfs/f5/cefi/scratch/Yi-cheng.Teng/develop/20240525/CEFI-regional-MOM6/exps
You can compare the RESTART_48hrs/ocean.stats
from test3 (commit https://github.com/NOAA-CEFI-Regional-Ocean-Modeling/ocean_BGC/commit/768da600361def64f84b755efb671e0471aa3bcf) against the one from test2 (commit https://github.com/NOAA-CEFI-Regional-Ocean-Modeling/ocean_BGC/commit/53e04e3126526832b8d64469a5cdb5c353710f14). The test_org folder contains results from before PR #59.
Thanks, I will take a look.
Could it be because we are now multiplying by 1.0/sperd instead of dividing by sperd? I think 1.0/sperd should be exactly represented, but I am not sure.
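A quick check of that representability question, assuming the divisor (spelled both "spred" and "sperd" in this thread) is seconds per day, 86400.0 — the thread does not actually state its value:

```python
from fractions import Fraction

sperd = 86400.0          # assumed value: seconds per day
r = 1.0 / sperd          # the double nearest to 1/86400

# Fraction(r) is the exact rational value of the stored double.
# If 1/86400 were exactly representable, this product would be exactly 1.
print(Fraction(r) * 86400 == 1)  # False: 86400 has the odd factor 675,
                                 # so 1/86400 is not a dyadic rational
```

So the reciprocal is not exact in binary floating point, which is consistent with the answer changes observed below.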
Thanks @theresa-morrison. It should be the same, but I will give it a try with the -O0 flag and check the results again.
All the zooplankton swimming parameter names are the same in this commit. I am not sure whether the swim speed is actually used, and whether it would be the same as in the previous commit.
Good catch. My understanding is that zooplankton swimming is not really activated in our current simulation, but I can double-check to ensure that is the case.
I agree, that is what I thought as well.
Should I create a new commit that rolls back some of these rescaling changes?
Yup we probably need a new commit to fix this issue. Let me do some further investigations first to ensure which specific rescaling changes are causing the problem.
Just wanted to confirm that the swimming parameters should be inactive and so unlikely to cause any issues.
OK, although value/sperd and (1.0/sperd)*value are mathematically equivalent and yield the same results in single precision, this is not the case with double precision (the -r8 compiler option is one of the default flags in the current fremake). Floating-point operations can introduce small discrepancies due to rounding errors in double precision. I think we have two solutions here:
1. Roll back the rescaling changes. This ensures the results stay bit-for-bit identical (b2b), but we will need extra lines to handle parameter rescaling.
2. Keep the current changes. This means fully utilizing MOM's get_param and its scale argument. However, this introduces small discrepancies, and we will need to conduct long-term MOM6-COBALT coupled runs for all the domains we are testing now to ensure the modeled results do not drift away.
Personally, I prefer the second option, but as I mentioned, we will need to conduct long-term experiments first to ensure we are satisfied with the results.
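To make the rounding point above concrete, here is a small Python sketch (Python floats are IEEE doubles, matching -r8); sperd = 86400.0 (seconds per day) is an assumed value and the variable names are illustrative:

```python
# Compare value / sperd against (1.0 / sperd) * value in double precision.
# sperd = 86400.0 (seconds per day) is assumed for illustration.
sperd = 86400.0
inv_sperd = 1.0 / sperd  # rounded once here ...

mismatches = 0
for i in range(1, 100001):
    value = i * 0.1
    if value / sperd != value * inv_sperd:  # ... and rounded again here
        mismatches += 1

# The two forms agree mathematically but not bit-for-bit:
print(mismatches > 0)  # True
```

The reciprocal is rounded when it is stored, and the product is rounded again, so a fraction of inputs land on a different last bit than a single correctly rounded division would give — which is exactly the kind of b2b break seen in the NWA12 tests.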
Thanks Yi-Cheng,
I lean toward the second option as well. I think we can afford a few extra runs to test things thoroughly.
Would it be useful/possible to run paired simulations in single precision?
@charliestock, thanks. Sure, I have submitted two runs, with and without the new changes. I will also submit an additional test using single precision.

Update: it turns out that building the coupled system with single precision is not that straightforward, so I will put it on hold for now. I'll keep you posted.
This issue has been resolved. See details here.
Although @theresa-morrison's #59 passed our 1D case testing, it failed the NWA12-COBALT testing (answers are not identical to the baselines). Interestingly, the answers are consistent with the baselines until we introduce the "Use scale argument in get_param" commit. We'll need to re-check commit 768da600 to ensure it works as expected.