Open idharssi2020 opened 11 months ago
KeyError: 'msl0'
I've created initial conditions for GraphCast (GC) using files downloaded from https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/. I use ecCodes to extract the required fields for 0Z and 6Z.
I run GC using ai-models --input file --file dump.grib --date $1 --time 0600 --expver gfs1 graphcast --debug 2> log.txt >log.txt
I get the following errors
tail -20 log.txt
sys.exit(main())
File "/home/548/ixd548/.conda/envs/ai_models0211/lib/python3.10/site-packages/ai_models/__main__.py", line 285, in main
_main()
File "/home/548/ixd548/.conda/envs/ai_models0211/lib/python3.10/site-packages/ai_models/__main__.py", line 258, in _main
model.run()
File "/g/data/dp9/ixd548/ai-models/ai-models-graphcast/ai_models_graphcast/model.py", line 248, in run
save_output_xarray(
File "/g/data/dp9/ixd548/ai-models/ai-models-graphcast/ai_models_graphcast/output.py", line 34, in save_output_xarray
all_fields = all_fields.order_by(
File "/home/548/ixd548/.local/lib/python3.10/site-packages/climetlab/core/index.py", line 210, in order_by
indices = sorted(indices, key=functools.cmp_to_key(cmp))
File "/home/548/ixd548/.local/lib/python3.10/site-packages/climetlab/core/index.py", line 207, in cmp
return order.compare_elements(self[i], self[j])
File "/home/548/ixd548/.local/lib/python3.10/site-packages/climetlab/core/index.py", line 87, in compare_elements
n = v(a_metadata(k), b_metadata(k))
File "/home/548/ixd548/.local/lib/python3.10/site-packages/climetlab/core/index.py", line 120, in __call__
return ascending(self.get(a), self.get(b))
File "/home/548/ixd548/.local/lib/python3.10/site-packages/climetlab/core/index.py", line 123, in get
return self.order[x]
KeyError: 'msl0'
I think the issue is the way surface levels are represented in the input file dump.grib. If I use grib_ls
grib_ls dump.grib
dump.grib
edition centre date dataType gridType stepRange typeOfLevel level shortName packingType
...
2 kwbc 20231013 fc regular_ll 0 isobaricInhPa 1000 z grid_complex_spatial_differencing
2 kwbc 20231013 fc regular_ll 0 surface 0 lsm grid_complex_spatial_differencing
2 kwbc 20231013 fc regular_ll 0 surface 0 z grid_complex_spatial_differencing
2 kwbc 20231013 fc regular_ll 0 surface 0 tp grid_complex_spatial_differencing
2 kwbc 20231013 fc regular_ll 0 meanSea 0 msl grid_complex_spatial_differencing
2 kwbc 20231013 fc regular_ll 0 heightAboveGround 10 10u grid_complex_spatial_differencing
2 kwbc 20231013 fc regular_ll 0 heightAboveGround 10 10v grid_complex_spatial_differencing
2 kwbc 20231013 fc regular_ll 0 heightAboveGround 2 2t grid_complex_spatial_differencing
I tried grib_set -s typeOfLevel=surface,level=0, but this changes the variable names from msl, 2t, 10u, 10v to sp, t, u, v.
If I then use grib_set -s shortName= to restore the names, the level and level type are reset to their old values.
Would it be possible to change the code so that the level height isn't used as part of the key for surface variables?
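The traceback is consistent with the sort key being built by joining the parameter name and the level, which is what the `remapping={"param_level": "{param}{levelist}"}` argument in `save_output_xarray` appears to do. The sketch below only illustrates that idea with plain dictionaries. `ORDERING`, `naive_key`, and `surface_aware_key` are hypothetical names and are not part of climetlab or ai-models-graphcast, and the ordering table is an assumption made for the example.

```python
# Hypothetical ordering table (an assumption for this example): surface
# fields are listed by name only, pressure-level fields by name plus level.
ORDERING = ["msl", "2t", "10u", "10v", "z1000", "t850"]
ORDER_INDEX = {name: i for i, name in enumerate(ORDERING)}

PRESSURE_LEVEL_TYPE = "isobaricInhPa"


def naive_key(field):
    """Join the short name and the level, whatever the level type is."""
    return f"{field['shortName']}{field['level']}"


def surface_aware_key(field):
    """Append the level only for pressure-level fields."""
    if field["typeOfLevel"] == PRESSURE_LEVEL_TYPE:
        return f"{field['shortName']}{field['level']}"
    return field["shortName"]


# Metadata copied from the grib_ls listing above.
msl = {"shortName": "msl", "typeOfLevel": "meanSea", "level": 0}
z1000 = {"shortName": "z", "typeOfLevel": "isobaricInhPa", "level": 1000}

print(naive_key(msl), naive_key(msl) in ORDER_INDEX)
print(surface_aware_key(msl), ORDER_INDEX[surface_aware_key(msl)])
print(surface_aware_key(z1000), ORDER_INDEX[surface_aware_key(z1000)])
```

With the naive key, an explicit level of 0 on a mean-sea-level field produces `msl0`, which a name-only table cannot resolve. Keying single-level fields by name alone avoids the lookup failure in this toy setting.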
It looks as though GC finished, as a file output.nc is generated:
du -sh *
1.2G 00
1.2G 06
151M dump.grib
28K dump.txt
159M forcings_xr.nc
674M input_xr.nc
116K log.txt
13G output.nc
0 params
0 stats
14G training_xarray.nc
I commented out the all_fields.order_by and changed the if level != 0 to if level > 20. My modified code looks like
def save_output_xarray(
    *,
    output,
    target_variables,
    write,
    all_fields,
    ordering,
    lead_time,
    hour_steps,
    lagged,
):
    LOG.info("Converting output xarray to GRIB and saving")

    output["total_precipitation_6hr"] = output.data_vars[
        "total_precipitation_6hr"
    ].cumsum(dim="time")

    # all_fields = all_fields.order_by(
    #     valid_datetime="descending",
    #     param=ordering,
    #     #remapping={"param_level": "{param}{levelist}"},
    # )

    for time in range(lead_time // hour_steps):
        for fs in all_fields[: len(all_fields) // len(lagged)]:
            param, level = fs["shortName"], fs["level"]

            if level > 20:
                param = GRIB_TO_XARRAY_PL.get(param, param)
                if param not in target_variables:
                    continue
                values = output.isel(time=time).sel(level=level).data_vars[param].values
            else:
                param = GRIB_TO_CF.get(param, param)
                param = GRIB_TO_XARRAY_SFC.get(param, param)
                if param not in target_variables:
                    continue
                values = output.isel(time=time).data_vars[param].values

            # We want to field north=>south
            values = np.flipud(values.reshape(fs.shape))

            write(
                values,
                template=fs,
                step=(time + 1) * hour_steps,
            )
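The `level > 20` test separates the fields in this file because the single-level fields sit at 0, 2, or 10. It relies on the level values, though, so a pressure level of 10 or 20 hPa, if a model configuration used one, would be treated as a surface field. A possible alternative is to branch on the level type reported by grib_ls. The sketch below assumes that pressure-level fields always carry `typeOfLevel` equal to `isobaricInhPa`, as in the listing above. `is_pressure_level` and `split_fields` are hypothetical helpers that work on plain dictionaries rather than on climetlab field objects.

```python
PRESSURE_LEVEL_TYPE = "isobaricInhPa"


def is_pressure_level(field):
    """True when the field is defined on a pressure level."""
    return field["typeOfLevel"] == PRESSURE_LEVEL_TYPE


def split_fields(fields):
    """Partition fields into (pressure-level, single-level) lists."""
    pressure, single = [], []
    for field in fields:
        (pressure if is_pressure_level(field) else single).append(field)
    return pressure, single


fields = [
    # Rows taken from the grib_ls listing in this post.
    {"shortName": "z", "typeOfLevel": "isobaricInhPa", "level": 1000},
    {"shortName": "lsm", "typeOfLevel": "surface", "level": 0},
    {"shortName": "msl", "typeOfLevel": "meanSea", "level": 0},
    {"shortName": "10u", "typeOfLevel": "heightAboveGround", "level": 10},
    {"shortName": "2t", "typeOfLevel": "heightAboveGround", "level": 2},
    # Hypothetical row, not in the listing: a 10 hPa pressure level.
    {"shortName": "t", "typeOfLevel": "isobaricInhPa", "level": 10},
]

pressure, single = split_fields(fields)
print([f["shortName"] for f in pressure])
print([f["shortName"] for f in single])
```

In this toy data the 10 hPa field and the 10 m wind share the level value 10 but land in different groups, which a threshold on the level alone could not achieve.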
Hello, could you please provide me with more information on how to initialize GraphCast using GFS products? I am interested in utilizing GFS products for real-time data analysis, but I am unsure where to begin. Your assistance in refining the above statements would be greatly appreciated.
I use eccodes command line tools (https://confluence.ecmwf.int/display/ECC/GRIB+tools+examples) to extract and process the NCEP GFS analyses which are already in grib2 format and on pressure levels. I needed to convert some units and so scale some of the GFS fields. I also needed to rename some of the variables. My script is only 27 lines long. I'm still checking the script and will share it when it is ready.
Thanks! Looking forward to your updates!
I downloaded 4 GRIB files (06_plev, 06_surface, 12_plev, 12_surface) and used them to run graphcast as follows:
ai-models --input file --file graphcast_intput_20230909_06_plev.grib graphcast_intput_20230909_06_surface.grib graphcast_intput_20230909_12_plev.grib graphcast_intput_20230909_12_surface.grib graphcast
The error is:
ai-models: error: argument MODEL: invalid choice: 'graphcast_intput_20230909_06_surface.grib' (choose from 'graphcast')
Should I combine the 4 files into 1 file? Thanks!
I combined the 4 files into 1 file (graphcast_intput_20231001.grib) and ran:
ai-models --input file --file graphcast_intput_20231001.grib graphcast
the error is
2024-01-09 19:40:41,159 INFO Loading params/GraphCast_operational - ERA5-HRES 1979-2021 - resolution 0.25 - pressure levels 13 - mesh 2to6 - precipitation output only.npz: 0.3 second.
2024-01-09 19:40:41,159 INFO Building model: 0.3 second.
2024-01-09 19:40:41,565 INFO Creating training data: 0.4 second.
2024-01-09 19:40:41,565 INFO Creating input data (total): 0.4 second.
2024-01-09 19:40:41,566 INFO Total time: 1 second.
Traceback (most recent call last):
File "/data7/01_ai_models/miniconda/envs/ai-models/bin/ai-models", line 8, in
The input files were downloaded from MARS; I transferred the cache file to a GRIB file.
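The `invalid choice` error above suggests that `--file` takes a single path, so the second file name was read as the model name. Each GRIB message carries its own start marker, length, and end marker, so GRIB files can be joined byte for byte, which is what `cat` does. A minimal Python sketch of that step follows. `concatenate_grib` is a hypothetical helper, and the file names and contents in the demo are placeholders rather than real GRIB data.

```python
import shutil
import tempfile
from pathlib import Path


def concatenate_grib(parts, destination):
    """Append the bytes of each part, in order, to one output file.

    The destination must not be one of the parts, since it is
    truncated before anything is read.
    """
    destination = Path(destination)
    with destination.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                shutil.copyfileobj(src, out)
    return destination


if __name__ == "__main__":
    # Demo with placeholder bytes standing in for real GRIB messages.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        names = ["06_plev.grib", "06_surface.grib", "12_plev.grib", "12_surface.grib"]
        parts = []
        for name in names:
            path = tmp / name
            path.write_bytes(f"<{name}>".encode())
            parts.append(path)
        combined = concatenate_grib(parts, tmp / "combined.grib")
        print(combined.stat().st_size, "bytes written")
```

Whether ai-models then finds every field it needs in the combined file depends on the contents of the parts; the sketch only covers the joining step.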
I have run graphcast successfully, using ERA5 reanalysis fields obtained from CDS. To test out the --file option, I have tried giving the same [albeit concatenated and renamed] CDS files back to graphcast, but am running into a problem.
The successful run obtained these files from CDS: cds-retriever-fc80dd0245970d72ee767b0af3106647ef559ca2d5e4bb2e0fd32d4e9b39f1c2.cache cds-retriever-874becd411c61b19487676bd6137133b4abd69f531f552aca9d42af8c96bf817.cache
I moved them to my ai-models directory, combined them into a single file (via Unix cat), and renamed it "combined_file.grib". I then tried to run graphcast as
ai-models --file combined_file.grib --date 20211230 --time 12 --path 'out-{step}.grib' --lead-time 24 --debug graphcast
... which resulted in this error
2024-01-09 13:08:57,623 INFO Creating input data (total): 0.4 second.
2024-01-09 13:08:57,623 INFO Total time: 2 seconds.
Traceback (most recent call last):
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/bin/ai-models", line 8, in <module>
sys.exit(main())
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/ai_models/__main__.py", line 297, in main
_main()
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/ai_models/__main__.py", line 270, in _main
model.run()
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/ai_models_graphcast/model.py", line 201, in run
training_xarray, time_deltas = create_training_xarray(
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/ai_models_graphcast/input.py", line 84, in create_training_xarray
forcing_numpy = forcing_variables_numpy(
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/ai_models_graphcast/input.py", line 49, in forcing_variables_numpy
ds = cml.load_source(
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/climetlab/sources/__init__.py", line 178, in load_source
src = get_source(name, *args, **kwargs)
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/climetlab/sources/__init__.py", line 159, in __call__
source = klass(*args, **kwargs)
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/climetlab/core/__init__.py", line 25, in __call__
obj.__init__(*args, **kwargs)
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/climetlab/sources/constants.py", line 269, in __init__
self.numbers = find_numbers(source_or_dataset)
File "/network/rit/lab/fovelllab_rit/anaconda3/envs/ai/lib/python3.10/site-packages/climetlab/sources/constants.py", line 234, in find_numbers
return source_or_dataset.unique_values(
KeyError: 'number'
Any ideas? Thanks!
@I-Dhar
I use eccodes command line tools (https://confluence.ecmwf.int/display/ECC/GRIB+tools+examples) to extract and process the NCEP GFS analyses which are already in grib2 format and on pressure levels. I needed to convert some units and so scale some of the GFS fields. I also needed to rename some of the variables. My script is only 27 lines long. I'm still checking the script and will share it when it is ready.
Curious if you figured it out?
Dear Developers,
Thanks so much for making your ai-models wrapper available. I have graphcast running with the wrapper. Next I would like to use a local analysis created using 4DEnVar to initialise graphcast. The analysis is global and has a resolution of about 12 km. Data is available at all the required pressure levels.
Should the source file be grib or netCDF?
Do I need to regrid the analyses to 0.25 degrees resolution? I can use CDO to do this but just wondered if it is necessary.
Also, graphcast needs analyses at two time periods, 6 hours apart. What time-stamps should I put on the input file?
Apologies if this is all in the documentation. I might figure all this out by trial and error but any extra guidance is much appreciated.
Thanks
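On the time-stamp question, the first post in this thread supplies 0Z and 6Z fields and passes `--time 0600`, which suggests that the run time is the later of the two analyses and that the other one is 6 hours earlier. The sketch below computes the two valid times under that assumption. `input_times` is a hypothetical helper and is not part of ai-models.

```python
from datetime import datetime, timedelta

# Spacing between the two input states, as described in the question.
STEP = timedelta(hours=6)


def input_times(date, time):
    """Return (earlier, later) analysis times for a run at date/time.

    `date` is 'YYYYMMDD' and `time` is 'HHMM'. The run time is assumed
    to be the later of the two analyses.
    """
    start = datetime.strptime(f"{date}{time}", "%Y%m%d%H%M")
    return start - STEP, start


earlier, later = input_times("20231013", "0600")
print(earlier.isoformat(), later.isoformat())

# A run at 00Z needs an analysis from the previous day.
earlier, later = input_times("20231013", "0000")
print(earlier.isoformat(), later.isoformat())
```

Under this assumption, a run started at 00Z needs the 18Z analysis from the previous day as its earlier input.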