NOAA-CEFI-Regional-Ocean-Modeling / ocean_BGC

3 stars 8 forks source link

Add support for a COBALT_param_doc file #59 failed the NWA12-COBALT testing #64

Closed yichengt900 closed 5 months ago

yichengt900 commented 5 months ago

Although @theresa-morrison 's #59 passed our 1D case testing, it failed the NWA12-COBALT testing (answers are not identical to the baselines). Interestingly, the answers are consistent with the baselinese until we introduced the Use scale argument in get_param commit. We'll need to re-check commit 768da600 to ensure it works as expected.

theresa-morrison commented 5 months ago

commit 768da600 made some parameters too long to fit on a single line, breaking up the lines could have introduced some error.

PR59 also changed the default of do_case2_mod from true to false. It should be true to get the previous NWA12 solutions.

Can you point me to the out put files? Looking at which variables diverge first many give a hint to which parameter has the error.

yichengt900 commented 5 months ago

@theresa-morrison , thanks. I noticed do_case2_mod change and it has been added to all the NWA12 tests I conducted so far. I was able to get b2b answers in your early commits. I suspect it might be something to do with the scale option provided by get_param.

You can find the testing folders here:

/gpfs/f5/cefi/scratch/Yi-cheng.Teng/develop/20240525/CEFI-regional-MOM6/exps

You can compare the RESTART_48hrs/ocean.stats from test3 (commit https://github.com/NOAA-CEFI-Regional-Ocean-Modeling/ocean_BGC/commit/768da600361def64f84b755efb671e0471aa3bcf) and the one from test2 (commit https://github.com/NOAA-CEFI-Regional-Ocean-Modeling/ocean_BGC/commit/53e04e3126526832b8d64469a5cdb5c353710f14). The test_org contains results before PR #59.

theresa-morrison commented 5 months ago

Thanks, I will take a look. Could it be because we are now multiplying by 1.0/spred instead of dividing by sperd? I think 1.0/spred should be exactly represented but I am not sure.

yichengt900 commented 5 months ago

Thanks @theresa-morrison . It should be the same but I will give it a try with -O0 flag and check the results again.

theresa-morrison commented 5 months ago

All the zooplankton swimming parameter names all the same in this commit. I am not sure if the swim speed is actually used and would be in the same in the previous commit.

yichengt900 commented 5 months ago

Good catch. My understanding is that zooplankton swimming is not really activated in our current simulation, but I can double-check to ensure that is the case.

theresa-morrison commented 5 months ago

I agree, that is what I thought as well.

Should I create a new commit that rolls back some of these rescaling changes?

yichengt900 commented 5 months ago

Yup we probably need a new commit to fix this issue. Let me do some further investigations first to ensure which specific rescaling changes are causing the problem.

charliestock commented 5 months ago

Just wanted to confirm that the swimming parameters should be inactive and so unlikely to cause any issues.

yichengt900 commented 5 months ago

OK, although value/spred and (1.0/spred)*value are mathematically equivalent and yield the same results in single precision, this is not the case with double precision (i.e.., the -r8 compiler option is one of the default flags in the current fremake). Floating-point operations can introduce small discrepancies due to rounding errors in double precision. I think we have two solutions here:

  1. Roll back the rescaling changes. This ensures the results are bit-for-bit identical (b2b), but we will need extra lines to handle parameter rescaling.

  2. Keep the current changes. This means fully utilizing the MOM get_param and the scale argument. However, this introduces small discrepancies, and we will need to conduct long-term MOM6-COBALT coupled runs for all the domains we are testing now to ensure the modeled results do not drift away.

Personally, I prefer the second option, but as I mentioned, we will need to conduct long-term experiments first to ensure we are satisfied with the results.

charliestock commented 5 months ago

Thanks Yi-Cheng,

I lean toward the second option as well. I think we can afford a few extra runs to test things thoroughly.

Would it be useful/possible run paired simulation in single precision?

yichengt900 commented 5 months ago

@charliestock, thanks. Sure, I have submitted two runs with and without the new changes. I will also submit an additional test using single precisionIt turns out that building the coupled system with single precision is not that straightforward. I will put it on hold for now. I'll keep you posted.

yichengt900 commented 5 months ago

This issue has been resolved. See details here.