Open StevePny opened 7 months ago
`fregrid` is maybe what you are looking for. Raw diagnostics and restart files from FV3-based models are output on the cubed sphere native grid, which is logically rectangular on each cubed sphere tile, but not globally. `fregrid` can regrid fields[^1][^2] in those files to a regular latitude-longitude grid, which is globally rectangular, so a global horizontal field can be stored in a simple 2D array. This can be viewed as "combining" data from the tiles together, though keep in mind it is also a grid transformation (perfectly valid and frequently used of course!).
`mppnccombine` is somewhat of a lower-level tool that can be useful to preprocess files before sending them to a tool like `fregrid`. It is only needed if the data was produced using an `fv_core_nml.io_layout` not equal to `1, 1`.
For example, if an I/O layout of `2, 2` were used, you would have a set of 24 files per restart category, where each tile was broken up into four subdomains. For tile one of `gfs_data` you would start from:
gfs_data.tile1.nc.0000
gfs_data.tile1.nc.0001
gfs_data.tile1.nc.0002
gfs_data.tile1.nc.0003
and then use `mppnccombine` to produce the "combined" `gfs_data.tile1.nc` file:
$ mppnccombine gfs_data.tile1.nc gfs_data.tile1.nc.*
You would do the same for tiles two through six. Your data is already combined in this sense, so the `mppnccombine` step is not needed.
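The per-tile step above can be wrapped in a loop over all six tiles. The following is a minimal sketch, not a tested workflow: the function name `combine_tiles` and the `DRY_RUN` switch are hypothetical names introduced here for illustration, and it assumes the subdomain files follow the `<prefix>.tileN.nc.NNNN` naming shown above. With `DRY_RUN` left at its default of 1, it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: combine the io_layout subdomain files for each cubed-sphere tile.
# combine_tiles and DRY_RUN are illustrative names, not part of FRE-NCtools.
combine_tiles() {
    local prefix="$1"   # restart category, e.g. gfs_data
    local tile out
    for tile in 1 2 3 4 5 6; do
        out="${prefix}.tile${tile}.nc"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: print the command without touching any files.
            echo "mppnccombine ${out} ${out}.*"
        else
            # Real run: the glob expands to .0000, .0001, ... for this tile.
            mppnccombine "${out}" "${out}".*
        fi
    done
}

combine_tiles gfs_data
```

Setting `DRY_RUN=0` runs `mppnccombine` for real, so it must be on your `PATH` and the subdomain files must be in the current directory.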
[^1]: You may run into trouble with this approach for the horizontal winds in restart files, however: they are on a staggered grid, which I do not believe `fregrid` supports. That may not be a concern if your interest is in other fields.
[^2]: Some fields in the surface restart files are categorical, e.g. land surface type, and so may not be amenable to regridding, which involves averaging or interpolation. Again, this may not be a concern depending on the variables of interest.
Hi @spencerkclark, thanks for your suggestions. And also thank you for the clarification on mppnccombine, which I have used before with MOM6 and incorrectly assumed could be used similarly for this objective with FV3 as well.
We are currently using fregrid with FV3-SHiELD to interpolate the output to a regular global grid for our own runs. This works ok for viewing output globally and calculating basic statistics.
Unfortunately this does not work for us when using the GEFS archive data. Below I'm using the UFS/GFS mosaic and remap_weights files provided at: https://ftp.emc.ncep.noaa.gov/static_files/public/UFS/GFS/fix/fix_fv3/C384
After attempting to use fregrid, it seems that it looks for a variable named lon but cannot find it:
docker run -v /Users/spenny/Data/GEFS/p01:/rundir -it gfdl/fretools:latest fregrid --input_mosaic C384_mosaic.nc --nlon 1440 --nlat 720 --lonBegin 0 --lonEnd 360 --latBegin -90 --latEnd 90 --input_file gfs_data --output_file gfs_data.nc --input_dir . --scalar_field t --interp_method conserve_order1 --remap_file remap_weights_C384_1deg.nc
****fregrid: first order conservative scheme will be used for regridding.
Error from pe 0: mpp_io(mpp_get_varid): error in get field_id of variable lon from file ./gfs_data.tile1.nc: NetCDF: Variable not found
I'm guessing either a different set of mosaic files is needed, specifically for GEFS (though I'm not sure where to find these), or there is a fregrid setting needed to tell it where to look for lat/lon information.
Here is the GEFS file header:
Here is the C384 mosaic grid tile 1 header:
Hi, I'm trying to apply mppnccombine to produce a combined file from the 6-tiled restart data from GEFS, which I assume is using the GFDL FMS: https://noaa-gefs-pds.s3.amazonaws.com/index.html#gefs.20240208/00/atmos/init/p01/
[EDIT: @spencerkclark mentioned here: https://github.com/pydata/xarray/discussions/8730 that mppnccombine is the wrong tool for this purpose - What FRE-NCtools should be used to do this instead - fregrid? combine_restarts?]
I assume mppnccombine is the appropriate tool to use here. I've tried building FRE-NCtools in a docker image both on my own laptop (mac M1) and on an AWS ec2 instance running ubuntu.
In both cases, I run something like:
And yet the resulting gfs_data.nc file looks like:
which suggests to me that the final file contains only the first tile and the combine operation has failed.
Am I using the tool incorrectly, or is there another tool that is more appropriate for this dataset?