NOAA-GFDL / CEFI-regional-MOM6

A repository containing essential tools, XML files, and source codes for collaborators of the Climate, Ecosystems, and Fisheries Initiative (CEFI) to conduct simulations.

Diagnostic scripts producing inconsistent output #87

Closed: uwagura closed this issue 2 months ago

uwagura commented 2 months ago

Several scripts within the diagnostics/physics subfolder have been producing output that is inconsistent between runs. When I originally ran sss_eval.py on 07/29/2024, I produced the following figure:

[figure: sss_eval]

However, all recent runs (i.e. from at least 08/29/2024 to present) produce the following output:

[figure: sss_eval_config]

Note that the skill metrics are slightly different, and that the model - oisst, model - glorys, and model plots all differ slightly between these two runs.

Andrew previously ran the same script and produced the figure published here; it has a slightly different glorys plot and slightly different model-glorys skill metrics compared to both of the above runs.

[figure: figure04]

A similar issue occurs with the sst_eval script. A run that took place around 07/29/2024 produced the following output:

[figure: sst_eval]

while a run from a few days later (08/01/2024) produced the following figure: [figure: sst_eval]

The August runs of these plots were likely produced using the conda environment from Yi-Cheng's setup, which hasn't been modified recently. The July runs most likely used the same environment as well. Likewise, all of these runs used the same history files (currently listed at the top of each of these scripts), and those files have not been altered since the initial run.

It's unclear what's causing the discrepancy. MSD thinks the analysis nodes having slightly different firmware (some nodes support AVX, while others don't) may have contributed to this issue, or perhaps some other subtle hardware difference that Python is sensitive to. In my testing, I did not notice any difference between the nodes with different AVX versions (listed here).

Regardless, if you run the diagnostic scripts and receive two different plots, please respond with 1) the path to the Python environment where you ran the script and 2) the analysis node you ran the script on, so that we can look into both hardware and software issues when debugging this. A sketch for gathering this information is below.
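Something like the following would capture everything in one go (a minimal sketch, assuming a Linux analysis node where /proc/cpuinfo is readable):

```python
# Sketch for collecting the environment/node info requested above.
# Assumes a Linux node; /proc/cpuinfo may not exist elsewhere.
import platform
import sys

print("python executable:", sys.executable)  # path into the active (conda) env
print("analysis node:", platform.node())

# Report which AVX variants this node's CPU advertises, if any
try:
    with open("/proc/cpuinfo") as cpuinfo:
        flags = {flag for line in cpuinfo if line.startswith("flags")
                 for flag in line.split(":", 1)[1].split()}
    avx = sorted(flag for flag in flags if flag.startswith("avx"))
    print("AVX flags:", ", ".join(avx) if avx else "none")
except FileNotFoundError:
    print("AVX flags: /proc/cpuinfo not available on this system")
```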

andrew-c-ross commented 2 months ago

It's interesting that in both cases, the more recent version of the plots is closer to what is in the original published paper. This would suggest there was something weird in July that has been fixed, rather than a problem that started in August.

uwagura commented 2 months ago

@andrew-c-ross,

(UPDATE: Turns out the issue with these plots was human error :). I am still looking into the difference in the GLORYS plots between the paper and the current Python environment, but for the most part this issue is resolved. A fuller update is incoming; I'm just updating this comment to avoid spamming people's inboxes.)

Yeah, it does seem like the results from July were the anomaly across most of the plots. The only exception I see so far is the mld003 plot. It's possible that I ran those figures in some strange Python environment that I've since deleted, containing packages that have since been updated. It's still odd, though, that the plots from the paper are slightly different from the plots produced right now.

mean_ssh_eval 07/29/2024: [figure: mean_ssh_eval]

mean_ssh_eval 09/12/2024: [figure: mean_ssh_eval]

sst_trends 07/29/2024: [figure: sst_trends]

sst_trends 09/12/2024: [figure: sst_trends]

mld003_eval 07/29/2024: [figure: mld003_eval]

mld003_eval 09/12/2024: [figure: mld003_eval]

uwagura commented 2 months ago

Turns out I was mistaken: all of the July plots shown in this issue were actually produced with history data stored in my own archive, which only covered about 10 years, not the 27-year history data from the paper that was used for all of the August plots. I guess opening and investigating this issue was nothing more than a lesson in tracking how I produce plots a bit more carefully.
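In hindsight, a quick sanity check on the inputs would have caught this. A minimal sketch, assuming the history files are netCDF with a standard time coordinate (the path here is a hypothetical placeholder, not an actual repo location):

```python
# Sketch: print the time span of a history file before plotting, so a
# ~10-year archive isn't silently confused with the 27-year paper run.
import xarray as xr

history_file = "/path/to/history/ocean_monthly.nc"  # hypothetical placeholder
ds = xr.open_dataset(history_file)
print("time range:", ds["time"].values[0], "to", ds["time"].values[-1])
print("time steps:", ds.sizes["time"])
```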

As for the discrepancy between the plots from the paper and the August plots: it looks like the plots published in the paper only used GLORYS data from 1993-2019, while the GLORYS data in /work/acr/mom6/diagnostics/glorys/glorys_sfc.nc stretches from 1993 to 2023-05. I can reproduce the plots in the paper by cutting off the GLORYS dataset at the end of 2019, so this was also just a dataset issue.
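For reference, the truncation is a one-liner with xarray (a sketch, assuming the file's time coordinate supports label-based slicing; the path is the one quoted above):

```python
# Sketch: reproduce the paper's plots by cutting GLORYS off at the end
# of 2019, dropping the 2020 through 2023-05 portion of the record.
import xarray as xr

glorys = xr.open_dataset("/work/acr/mom6/diagnostics/glorys/glorys_sfc.nc")
glorys_paper = glorys.sel(time=slice("1993", "2019"))
```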

All told, this issue is completely resolved.