E3SM-Project / scream

Fork of E3SM used to develop exascale global atmosphere model written in C++
https://e3sm-project.github.io/scream/

Netcdf files need to be certain format #1436

Closed ndkeen closed 2 years ago

ndkeen commented 2 years ago

I think we only want to use certain netCDF formats. In one case, a file in netCDF-4 format caused an issue that went away after converting it to cdf5.

To convert, you can use ncks -5 a.nc -o b.nc (or similar).

cori09% pwd
/global/cfs/cdirs/e3sm/inputdata/atm/scream/init

+ ncdump -k homme_physics_ne2np4.nc
64-bit offset
+ ncdump -k homme_physics_ne2np4_20220204.nc
netCDF-4
+ ncdump -k homme_shoc_cld_p3_rrtmgp_init_ne2np4.nc
64-bit offset
+ ncdump -k homme_shoc_cld_spa_p3_rrtmgp_init_ne2np4.nc
64-bit offset
+ ncdump -k homme_standalone_ne4np4.nc
64-bit offset
+ ncdump -k init_ne4np4.nc
64-bit offset
+ ncdump -k map_ne4np4_to_ne2np4_mono.nc
netCDF-4
+ ncdump -k p3_init_ne2np4.nc
64-bit offset
+ ncdump -k rrtmgp-allsky.nc
netCDF-4 classic model
+ ncdump -k rrtmgp-cloud-optics-coeffs-lw.nc
classic
+ ncdump -k rrtmgp-cloud-optics-coeffs-sw.nc
classic
+ ncdump -k rrtmgp-data-lw-g256-2018-12-04.nc
classic
+ ncdump -k rrtmgp-data-sw-g224-2018-12-04.nc
classic
+ ncdump -k rrtmgp_init_ne2np4.nc
64-bit offset
+ ncdump -k scream-aquaplanet_init_ne30np4_L128_20211202.nc
cdf5
+ ncdump -k scream-aquaplanet_init_ne4np4_L128_20211202.nc
cdf5
+ ncdump -k scream-aquaplanet_init_ne4np4_L72_20211202.nc
cdf5
+ ncdump -k scream_aquaplanet_ne4np4_L72.nc
64-bit offset
+ ncdump -k shoc_cld_p3_rrtmgp_init_ne2np4.nc
64-bit offset
+ ncdump -k shoc_cld_spa_p3_rrtmgp_init_ne2np4.nc
64-bit offset
+ ncdump -k shoc_init_ne2np4.nc
64-bit offset
+ ncdump -k spa_data_for_testing.nc
netCDF-4
+ ncdump -k spa_file_unified_and_complete_ne4_scream.nc
netCDF-4
+ ncdump -k spa_file_unified_and_complete_ne4_scream_cdf5.nc  (note I added this one as a test)
cdf5
+ ncdump -k spa_init_ne2np4.nc
netCDF-4
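
The listing above can be produced and summarized in one pass. The following is a minimal sketch, not an existing script: the function names nc_family and audit_nc_formats are hypothetical, and the grouping assumes (based on this thread) that the HDF5-based kinds are the ones likely to cause trouble. It relies only on ncdump -k, whose kind strings are the ones shown in the listing.

```shell
# Map the kind string printed by `ncdump -k` to a format family.
# classic, 64-bit offset and cdf5 are the netCDF-3 style formats
# (CDF-1, CDF-2, CDF-5); the netCDF-4 kinds are stored as HDF5.
nc_family() {
  case "$1" in
    classic|"64-bit offset"|cdf5) echo "netCDF-3 family" ;;
    netCDF-4|"netCDF-4 classic model") echo "netCDF-4/HDF5" ;;
    *) echo "unknown" ;;
  esac
}

# Hypothetical helper: print "file<TAB>kind<TAB>family" for every
# .nc file in a directory (defaults to the current directory).
audit_nc_formats() {
  local dir="${1:-.}" f kind
  for f in "$dir"/*.nc; do
    [ -e "$f" ] || continue              # no match: the glob stays literal
    kind="$(ncdump -k "$f")" || continue # skip files ncdump cannot read
    printf '%s\t%s\t%s\n' "$(basename "$f")" "$kind" "$(nc_family "$kind")"
  done
}
```

Calling audit_nc_formats with the init directory shown above as its argument would print one tab-separated line per file, which can then be filtered (for example with grep) for the netCDF-4/HDF5 entries.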

bartgol commented 2 years ago

Are all e3sm input files in cdf5 format? If not, then it might be that we are doing something wrong in our scorpio interfaces, which prevents us from reading netcdf-4 files...

PeterCaldwell commented 2 years ago

@bartgol - this is a long-running and strange situation. It is true that scorpio doesn't support netcdf4. As a result, all e3sm input files have been forced to be written in netcdf3 format. The reason for lack of support has to do with (if I remember correctly) parallel support for netcdf4 never really getting the attention it needed.

bartgol commented 2 years ago

Ok, we can definitely convert files to whatever scorpio can handle. I see several formats there though. I'm guessing netcdf-4 and netcdf-4 classic model are both bad. What about 64-bit offset or classic? Noel mentioned cdf5, which I think is yet another format...

jayeshkrishna commented 2 years ago

Please convert all the input files to the cdf5 (64bit data) file format. The NetCDF4 files have caused several issues (unexplained hangs etc.) in the past and haven't provided us the performance that we need.

(Also check out https://acme-climate.atlassian.net/wiki/spaces/DOC/pages/1007223420/NetCDF+explainer & https://acme-climate.atlassian.net/wiki/spaces/EIDMG/pages/769130507/Picking+a+netcdf+type+for+all+input+files)
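
A batch conversion along these lines could look like the sketch below. It is an illustration under stated assumptions rather than an established E3SM tool: the function name convert_dir_to_cdf5 and the source/destination directory layout are hypothetical, and it uses only ncks -5 and ncdump -k as they appear in this thread. Converted copies go to a separate directory so the originals stay untouched.

```shell
# Hypothetical helper: convert every non-cdf5 .nc file in src_dir,
# writing the converted copies into dst_dir.
convert_dir_to_cdf5() {
  local src="$1" dst="$2" f out kind
  mkdir -p "$dst" || return 1
  for f in "$src"/*.nc; do
    [ -e "$f" ] || continue                # no match: the glob stays literal
    kind="$(ncdump -k "$f")" || return 1
    if [ "$kind" = "cdf5" ]; then
      echo "skip     $f (already cdf5)"
      continue
    fi
    out="$dst/$(basename "$f")"
    if [ -e "$out" ]; then                 # never clobber an earlier result
      echo "skip     $f ($out exists)"
      continue
    fi
    ncks -5 "$f" "$out" || return 1
    # Check the kind of the new file before trusting it.
    if [ "$(ncdump -k "$out")" != "cdf5" ]; then
      echo "FAILED   $f -> $out" >&2
      return 1
    fi
    echo "convert  $f -> $out ($kind -> cdf5)"
  done
}
```

Stopping at the first failure is a deliberate choice here: a partially converted input directory is easier to reason about when the first error is the last thing printed.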

ndkeen commented 2 years ago

From Charlie Zender: "There is some understandable confusion about this because the terminology is confusing and the supported formats have evolved. The PnetCDF format was fully merged into netCDF 4.6.3 a few years ago, where it is now more widely known as the CDF5 or 64bit-data format. The E3SM toolchain fully and transparently supports CDF5, which (like netCDF4) is 64bit end-to-end and thus capacious enough to hold any dataset through the exascale era. You can BFB convert from any netCDF format to CDF5 with:"

ncks -5 in.nc out.nc

ndkeen commented 2 years ago

This was resolved.