Closed JianpingHuang-NOAA closed 1 year ago
@ytangnoaa
A good starting point might be the environment variables you have set up. UFS has had issues where an old environment variable was left lingering and was accidentally picked up by the code, yielding failures or bad results/outputs.
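One way to act on this suggestion is to dump the environment of each run (for example with `env > run_a.env` inside the job) and diff the two dumps. The sketch below is only an illustration: the function names `parse_env_dump` and `diff_env` and the sample values are hypothetical, and it assumes one `NAME=value` pair per line (multi-line values are not handled).

```python
def parse_env_dump(text):
    """Parse `env`-style output (one NAME=value per line) into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:  # skip lines that do not contain NAME=
            env[name] = value
    return env


def diff_env(env_a, env_b):
    """Return {name: (value_a, value_b)} for variables that differ.

    A value of None means the variable is not set in that run.
    """
    names = sorted(set(env_a) | set(env_b))
    return {
        name: (env_a.get(name), env_b.get(name))
        for name in names
        if env_a.get(name) != env_b.get(name)
    }


if __name__ == "__main__":
    # Hypothetical sample dumps; in practice read them from the saved files.
    run_a = parse_env_dump("HOMEaqm=/path/to/package\nGFS_SFC_INPUT=/path/to/sfc\n")
    run_b = parse_env_dump("HOMEaqm=/path/to/package\n")
    for name, (a, b) in diff_env(run_a, run_b).items():
        print(f"{name}: run A = {a!r}, run B = {b!r}")
```

Variables that appear in only one run are reported with `None` on the other side, which is the case a stale or missing setting would produce.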
My package HOMEaqm is:
cactus:/lfs/h2/emc/global/noscrub/lin.gan/git/aqm.v7.0.71
log: /lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/output/prod/today
COM: /lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/com/aqm/v7.0
@BrianCurtis-NOAA would you help us examine the configuration in these two forecast DATA locations:
/lfs/h2/emc/ptmp/jianping.huang/emc.para/tmp/run_fcst.2023050200 (@JianpingHuang-NOAA's run)
/lfs/h2/emc/stmp/lin.gan/aqm/ecflow_aqm/aqm_forecast_00.2023050200 (my run)
Thanks
I saw this issue, which looks related to the NEXUS process. The emissions for some species, such as NO, NO2, and PCE, are identical. The differences appear for CO, ALD2, AACD, etc. after merging. CO etc. in the time-split emission files are the same.
@ytangnoaa is this tied to biogenics?
A comparison between the two DATA/RESTART directories shows that the following file is different:
compare_ncfile.py /lfs/h2/emc/stmp/lin.gan/aqm/ecflow_aqm/aqm_forecast_00.2023050200/RESTART/fv_tracer.res.tile1.nc /lfs/h2/emc/ptmp/jianping.huang/emc.para/tmp/run_fcst.2023050200/RESTART/fv_tracer.res.tile1.nc
no2 is different
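For readers who want to reproduce this kind of per-variable check, here is a minimal sketch. It is not the actual compare_ncfile.py, whose internals are not shown in this thread. It assumes the tracer fields have already been read into plain dicts mapping variable name to a flat sequence of floats (for example with a NetCDF reader, omitted here); the function name `differing_variables` is hypothetical.

```python
import math


def differing_variables(vars_a, vars_b, rel_tol=0.0, abs_tol=0.0):
    """Return the sorted names of variables that differ between two runs.

    vars_a, vars_b: dict mapping variable name -> flat sequence of floats.
    With the default tolerances of 0.0 the values must match exactly.
    NaN never compares equal, so a NaN marks the variable as different.
    """
    different = []
    for name in sorted(set(vars_a) | set(vars_b)):
        a, b = vars_a.get(name), vars_b.get(name)
        # A variable missing from one run, or with another size, differs.
        if a is None or b is None or len(a) != len(b):
            different.append(name)
            continue
        if any(
            not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
            for x, y in zip(a, b)
        ):
            different.append(name)
    return different


if __name__ == "__main__":
    # Made-up values purely for illustration.
    run_a = {"no2": [1.0, 2.0], "co": [3.0]}
    run_b = {"no2": [1.0, 2.5], "co": [3.0]}
    for name in differing_variables(run_a, run_b):
        print(f"{name} is different")
```

Keeping the tolerances at zero matches the bit-for-bit reproducibility expectation discussed here; a nonzero tolerance would only be appropriate when small floating-point differences are acceptable.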
CO has no biogenic source. "CO_ant" in the split emission files of the two runs is the same.
@lgannoaa where are the corresponding log files?
@lgannoaa
I reran the 00z cycle for 20230502. The output files can be found at /lfs/h2/emc/ptmp/jianping.huang/emc.para/com/aqm/v7.0/aqm.20230502/00
The previous run's output files are saved at /lfs/h2/emc/ptmp/jianping.huang/emc.para/com/aqm/v7.0/aqm.20230502_Lin_package/00
They are identical, which means that our results are reproducible.
/lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/output/prod/today/aqm_nexus_post_split_00.o57118921 (line 2327) shows that the ${HOMEaqm}/sorc/arl_nexus/utils/python/combine_ant_bio.py utility was used to create /lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/com/aqm/v7.0/aqm.20230502/00/aqm.t00z.NEXUS_Expt.nc. This file is used by the forecast job. It differs in AACD between Jianping's run and mine, so it is a possible source of the difference in the forecast job output.
On Cactus:
My nexus_post_split job log: /lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/output/prod/today/aqm_nexus_post_split_00.o57118921
My fcst job log: /lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/output/prod/today/aqm_forecast_00.o57120862
@JianpingHuang-NOAA Would you please provide your run information here:
Jianping nexus_post_split job log: ?
Jianping fcst job log: ?
I wrote a driver to test this issue. It uses the same forecast DATA directory and replaces only aqm.t00z.NEXUS_Expt.nc with the one from Jianping's COM location. As expected, cmp shows that the forecast output is bit-identical between Jianping's run and my driver test. Therefore, the root cause of the forecast output difference is aqm.t00z.NEXUS_Expt.nc.
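The bit-for-bit comparison described above can be repeated across whole output directories with a short script. This is a sketch, not the driver used in this test: the function names `sha256_of` and `compare_dirs` and the `*.nc` pattern are assumptions. It compares raw bytes, as cmp does, so two files holding the same data but different metadata would still be reported as different.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_dirs(dir_a, dir_b, pattern="*.nc"):
    """Compare files matching `pattern` in two directories (non-recursive).

    Returns (identical, different, only_in_one) as sorted lists of names.
    """
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    names_a = {p.name for p in dir_a.glob(pattern) if p.is_file()}
    names_b = {p.name for p in dir_b.glob(pattern) if p.is_file()}
    identical, different = [], []
    for name in sorted(names_a & names_b):
        if sha256_of(dir_a / name) == sha256_of(dir_b / name):
            identical.append(name)
        else:
            different.append(name)
    return identical, different, sorted(names_a ^ names_b)


if __name__ == "__main__":
    import sys

    # Usage: python compare_dirs.py <dir_a> <dir_b>
    same, diff, lonely = compare_dirs(sys.argv[1], sys.argv[2])
    print("identical:", same)
    print("different:", diff)
    print("present in only one directory:", lonely)
```

Pointing it at the two COM directories for the same cycle would list which files still differ after swapping a single input.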
@lgannoaa /lfs/h2/emc/ptmp/jianping.huang/emc.para/output/20230502/
nexus_emission_2023050200_s00.id_1683222855.log
....
nexus_emission_2023050205_s02.id_1683222855.log
nexus_post_split_2023050200.id_1683222855.log
run_fcst_2023050200.id_1683222855.log
I checked the other intermediate files between these two runs:
NEXUS_Expt_combined.nc NEXUS_Expt_pretty.nc
and CO etc. are identical. The issue appears to be caused by the last step, "combine_ant_bio.py".
Barry, it is indeed caused by the biogenic difference. It is strange that CO emissions are affected.
@JianpingHuang-NOAA I checked your nexus_emission job log, for example:
/lfs/h2/emc/ptmp/jianping.huang/emc.para/output/20230502/nexus_emission_2023050200_s00.id_1683165202.log
and compared it to my ecflow run log:
/lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/output/prod/today/aqm_nexus_emission_00_00.o57118629
It looks like your job did not find the GFS sfc files (line 1119). Line 2348 of my job log shows they were found in /lfs/h2/emc/stmp/lin.gan/aqm/ecflow_aqm/aqm_nexus_gfs_sfc_00.2023050200
exregional_nexus_emission.sh line 92 requires GFS_SFC_INPUT to point to the location where your first job, nexus_gfs_sfc, linked the files.
I checked your nexus_gfs_sfc_2023050200.id_1683165202.log: those files were found and linked to GFS_SFC_STAGING_DIR=/lfs/h2/emc/ptmp/jianping.huang/emc.para/tmp/nexus_gfs_sfc.2023050200
Please modify your configuration to ensure GFS_SFC_INPUT is set to /lfs/h2/emc/ptmp/jianping.huang/emc.para/tmp/nexus_gfs_sfc.2023050200 in your nexus_emission job. Rerun your job and let us know if this fixes the issue. Thanks
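A small pre-flight check could catch this kind of misconfiguration before the emission job runs. The sketch below is an assumption-laden illustration, not part of the AQM scripts: the helper name `check_input_dir` is hypothetical, and it only verifies that the variable is set and points to a non-empty directory whose entries (including symlinks) resolve.

```python
import os


def check_input_dir(var_name, environ=None):
    """Return (ok, message) for the directory named by an environment variable."""
    environ = os.environ if environ is None else environ
    path = environ.get(var_name)
    if not path:
        return False, f"{var_name} is not set"
    if not os.path.isdir(path):
        return False, f"{var_name}={path} is not an existing directory"
    entries = os.listdir(path)
    if not entries:
        return False, f"{var_name}={path} is empty"
    # os.path.exists follows symlinks, so a dangling link is reported here.
    broken = sorted(n for n in entries if not os.path.exists(os.path.join(path, n)))
    if broken:
        return False, f"{var_name}={path} has broken links: {', '.join(broken)}"
    return True, f"{var_name}={path} contains {len(entries)} entries"


if __name__ == "__main__":
    ok, message = check_input_dir("GFS_SFC_INPUT")
    print(("OK: " if ok else "ERROR: ") + message)
    raise SystemExit(0 if ok else 1)
```

Run inside the job environment, it exits non-zero when GFS_SFC_INPUT is unset or points somewhere other than the directory the nexus_gfs_sfc job populated.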
I think this has been resolved. @lgannoaa Do you have any more comments on this?
We may close this ticket. The root cause has been found. Issue resolved.
Both Lin Gan (EIB) and I are testing the same package (exe, j-jobs, ex-scripts, etc.) for the 00z cycle on 20230502, but we are seeing differences between the two runs in the workflow-generated NEXUS emission inputs and in the dyn, phy, and aqm.prod files.
The differences include:
1) AACD for the NEXUS_Expt.nc files
2) CO for dyn files
3) AOD for phy files
4) ozone for *aqm.prod.nc files
Lin's ecflow-generated input/output files: /lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/com/aqm/v7.0/aqm.20230502/00
Jianping's rocoto-generated input/output files: /lfs/h2/emc/ptmp/jianping.huang/emc.para/com/aqm/v7.0/aqm.20230502/00
@bbakernoaa Can your group check why AACD differs between the NEXUS emission files generated by the two workflows?
ecflow package is at /lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/aqm.v7.0.71L/ush
@lgannoaa Can you provide your package location here?