Open forsyth2 opened 6 months ago
@tomvothecoder Re: https://github.com/E3SM-Project/zppy/pull/519#issuecomment-2127712530 -- this PR includes the changes that couldn't be included in #519 since the e3sm_diags
refactor isn't finished yet. My initial testing suggests this PR's implementation will be sufficient to fully remove CDAT dependencies from zppy
. But we'll have to wait to merge it until after the e3sm_diags
refactor is done.
@tomvothecoder I created a new "min case" cfg for testing sets that have already been migrated. They appear to generate alright using E3SM Diags built off the latest CDAT branch: https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_www/issue-346-diags-20240802/v3.LR.historical_0051/e3sm_diags/
I ran zppy -c tests/integration/generated/test_min_case_e3sm_diags_cdat_migrated_sets_chrysalis.cfg
.
I specifically set:
sets="lat_lon","tc_analysis","enso_diags","streamflow",
to test the latest sets added on https://github.com/E3SM-Project/e3sm_diags/commits/cdat-migration-fy24 (apparently missing qbo
as a recent set). [Note I still included lat_lon
as a set since I recall E3SM Diags sometimes running into problems if that set isn't run first.]
These three sets unfortunately ran into some errors. https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_www/test-diags-no-cdat-20240917/v3.LR.historical_0051/e3sm_diags/atm_monthly_180x360_aave/model_vs_obs_1987-1988/viewer/ only shows lat_lon
. Now, by #625, apparently tc_analysis
wasn't even showing up correctly on main
. But that still means enso_diags
and streamflow
ran into some issues on the migrated branch.
$ cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_output/test-diags-no-cdat-20240917/v3.LR.historical_0051/post/scripts
$ grep -v "OK" *status
# No outright errors
$ grep -in tc_analysis e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.o585928
$ grep -in enso_diags e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.o585928
$ grep -in streamflow e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.o585928
tc_analysis
:
IndexError: list index out of range
This error matches the one on main
, as noted in #625
enso_diags
:
OSError: No files found for target variable TAUX or derived variables ([('tauu',), ('surf_mom_flux_\
U',)]) in ts.
OSError: No files found for target variable LHFLX or derived variables ([('hfls',), ('QFLX',), ('su\
rface_upward_latent_heat_flux',)]) in ts.
OSError: No files found for target variable SHFLX or derived variables ([('hfss',), ('surf_sens_flu\
x',)]) in ts
OSError: No files found for target variable LHFLX or derived variables ([('hfls',), ('QFLX',), ('su\
rface_upward_latent_heat_flux',)]) in ts.
OSError: No files found for target variable SHFLX or derived variables ([('hfss',), ('surf_sens_flu\
x',)]) in ts.
OSError: No files found for target variable NET_FLUX_SRF or derived variables ([('FSNS', 'FLNS', 'Q\
FLX', 'PRECC', 'PRECL', 'PRECSC', 'PRECSL', 'SHFLX'), ('FSNS', 'FLNS', 'LHFLX', 'SHFLX'), ('FSNS', \
'FLNS', 'QFLX', 'SHFLX'), ('rsds', 'rsus', 'rlds', 'rlus', 'hfls', 'hfss')]) in ts.
OSError: No files found for target variable PRECT or derived variables ([('PRECT',), ('pr',), ('PRE\
CC', 'PRECL'), ('sat_gauge_precip',), ('PrecipLiqSurfMassFlux', 'PrecipIceSurfMassFlux')]) in ts.
OSError: No files found for target variable TAUX or derived variables ([('tauu',), ('surf_mom_flux_\
U',)]) in ts.
OSError: No files found for target variable TAUY or derived variables ([('tauv',), ('surf_mom_flux_\
V',)]) in ts.
OSError: No files found for target variable FSNS or derived variables ([('sfc_net_sw_all_mon',), ('\
rsds', 'rsus')]) in ts.
OSError: No files found for target variable NET_FLUX_SRF or derived variables ([('FSNS', 'FLNS', 'Q\
FLX', 'PRECC', 'PRECL', 'PRECSC', 'PRECSL', 'SHFLX'), ('FSNS', 'FLNS', 'LHFLX', 'SHFLX'), ('FSNS', \
'FLNS', 'QFLX', 'SHFLX'), ('rsds', 'rsus', 'rlds', 'rlus', 'hfls', 'hfss')]) in ts.
streamflow
:
OSError: No time series `.nc` file was found for 'RIVER_DISCHARGE_OVER_LAND_LIQ' in 'rof'
The enso_diags
and streamflow
errors seem to suggest missing data, but this confuses me because the comprehensive_v3 test uses the same simulation output as its data source...
@chengzhuzhang @tomvothecoder
https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_www/test-diags-no-cdat-20240917/v3.LR.historical_0051/e3sm_diags/atm_monthly_180x360_aave/model_vs_obs_1987-1988/viewer/ shows only lat_lon
despite tests/integration/template_min_case_e3sm_diags_cdat_migrated_sets.cfg
having sets="lat_lon","tc_analysis","enso_diags","streamflow",
.
The relevant output directory can be seen here, and is also mentioned in a code block above.
$ cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_output/test-diags-no-cdat-20240917/v3.LR.historical_0051/post/scripts
$ grep -v "OK" *status
# No obvious failures
# Possibly useful commands:
$ grep -in error e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.o585928
$ grep -in tc_analysis e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.o585928
$ grep -in enso_diags e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.o585928
$ grep -in streamflow e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1987-1988.o585928
Note https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_www/test-diags-no-cdat-20240917/v3.LR.historical_0051/e3sm_diags/atm_monthly_180x360_aave_mvm/model_vs_model_1987-1988/viewer/ also shows only lat_lon
in the final viewer.
@forsyth2 I'm looking at the log files (as below), it appears that time series files are not correctly feed into e3sm_diags, which means the code change that removes cdscan
step is not working with e3sm_diags
s new code base correctly.
>cp: cannot create regular file 'ts/FSNTOA_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/FLUT_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/FSNT_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/FLNT_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/FSNS_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/FLNS_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/SHFLX_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/QFLX_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/TAUX_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/TAUY_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/PRECC_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/PRECL_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/PRECSC_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/PRECSL_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/TS_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/TREFHT_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/CLDTOT_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/CLDHGH_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/CLDMED_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/CLDLOW_1987*.nc': No such file or directory
cp: cannot create regular file 'ts/U_1987*.nc': No such file or directory
cp: target 'RIVER_DISCHARGE_OVER_LAND_LIQ_*.nc' is not a directory
Thanks @chengzhuzhang. I haven't had problems with previous added sets though. I'll try to debug this further to see if it's something with how this PR handles these sets in particular...
@tomvothecoder we will need to coordinate on this issue with e3sm_diags refactoring.
The original cdscan codes basically subset and extract the metadata from input file for the specified time period and generate an xml
file with file name as T_startTime_endTime.xml
which cdms2
is able to read and process.
In the new e3sm_diags
code, for time-series, we use xarray.open_dataset
, and the input needs to be one single .nc file. I think we may need to update to xarray.open_mfdataset
(https://github.com/E3SM-Project/e3sm_diags/blob/7b53932d72ef7782818a5bd1e4e9d527420fec49/e3sm_diags/driver/utils/dataset_xr.py#L1035)which allows multiple files, and also to support subset time based on start and end years (i think the later is already supported by https://github.com/E3SM-Project/e3sm_diags/blob/7b53932d72ef7782818a5bd1e4e9d527420fec49/e3sm_diags/driver/utils/dataset_xr.py#L1185 ?).
Thanks @chengzhuzhang. I haven't had problems with previous added sets though. I'll try to debug this further to see if it's something with how this PR handles these sets in particular...
Could you clarify "the previous added sets", i think the issue only relevant for time-series based diagnostics sets. For sets depends on climo datasets, they should be fine.
Thanks @chengzhuzhang. I haven't had problems with previous added sets though. I'll try to debug this further to see if it's something with how this PR handles these sets in particular...
Could you clarify "the previous added sets", i think the issue only relevant for time-series based diagnostics sets. For sets depends on climo datasets, they should be fine.
@forsyth2 I saw the test results from Aug 6, yes by then those sets only rely on climo files.
@chengzhuzhang Does this mean the bug is for certain in the Diags code or should I dive deeper into the zppy changes in this PR?
@chengzhuzhang Does this mean the bug is for certain in the Diags code or should I dive deeper into the zppy changes in this PR?
It is not a bug in diags. We can also add the time selecting and slicing code in zppy
to be able to remove cdscan. We should have a discussion with @tomvothecoder on how best to approach this.
@chengzhuzhang Does this mean the bug is for certain in the Diags code or should I dive deeper into the zppy changes in this PR?
It is not a bug in diags.
To summarize this issue, based on what I'm reading:
lat_lon
works fine because the refactored e3sm_diags uses xc.open_mfdataset()
for climatology datasets (link), which supports directories or globs for .nc
files (like how zppy
specifies here)enso_diags
and streamflow
fail because they use time series datasets (via a directory). The refactored e3sm_diags uses xc.open_dataset()
for time series files (link). tc_analysis
seems to have been failing on main
and dev branch for another reasonI will open a PR to update e3sm_diags to use xc.open_mfdataset()
for time series files (here).
also to support subset time based on start and end years (i think the later is already supported by E3SM-Project/e3sm_diags@
7b53932
/e3sm_diags/driver/utils/dataset_xr.py#L1185 ?). We can also add the time selecting and slicing code inzppy
to be able to remove cdscan. We should have a discussion with @tomvothecoder on how best to approach this.
The time slicing and subsetting code you listed parses the filepath to get the start and end years. I'm not sure if this currently works with a directory specified as an input path.
@forsyth2 Can you provide the input paths for the test data? I am trying to reproduce your errors by running prov/e3sm.py but the test_ts
is a local reference.
enso_diags
andstreamflow
fail because they use time series datasets (via a directory). The refactored e3sm_diags usesxc.open_dataset()
for time series files ([link](https://github.com/E3SM-Project/e3sm_diags/blob/22cd0f569e7df4be147a8fe6fae3e00d5aeb9d8c/e3sm_diags/driver/utils/dataset_xr.py#L1035-
Note, opening multiple time series files is actually supported already if deriving from other variables.
If we are trying to open up multiple datasets for the same variable (e.g., split up by months or something) without deriving, then we need to add it in #861.
@tomvothecoder
Can you provide the input paths for the test data?
Everything is specified in the cfg I have copied in https://github.com/E3SM-Project/zppy/pull/598#issuecomment-2384025994. Specifically:
input = /lcrc/group/e3sm2/ac.wlin/E3SMv3/v3.LR.historical_0051
output = "/lcrc/group/e3sm/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_output/test-diags-no-cdat-20240917/v3.LR.historical_0051"
For the test_ts
directory specifically, I believe that would be copied from:
/lcrc/group/e3sm/ac.forsyth2/zppy_min_case_e3sm_diags_cdat_migrated_output/test-diags-no-cdat-20240917/v3.LR.historical_0051/post/atm/180x360_aave/ts/monthly/2yr
TODO:
zppy_min_case_e3sm_diags_cdat_migrated
Resolves #346. Resolves #80. Follow-up to #519.