NOAA-EMC / graphcast

GraphCastGFS
https://graphcastgfs.readthedocs.io/en/latest/index.html
Apache License 2.0

Update `gdas_utility.py` to use pygrib package for extracting variables instead of wgrib2 #18

Closed. SadeghTabas-NOAA closed this issue 5 months ago.

SadeghTabas-NOAA commented 5 months ago

Currently, we use `gdas_utility.py` to produce input states for the GraphCast model from GDAS. The utility uses wgrib2 to extract variables at different levels from the GRIB2 files generated by GDAS. However, we recommend switching to pygrib, since it should need less computation to extract and merge the data. A very similar implementation is provided in the sample linked below. The output of the updated `gdas_utility.py` needs to be verified against the output of the current version to catch any discrepancies.
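For the verification step, something along these lines could work. This is only a sketch: `compare_states` is a hypothetical helper (not part of `gdas_utility.py`), it assumes each output has already been loaded into a dict mapping a variable name to a flat sequence of floats, and the tolerance is a placeholder that would need to be agreed on.

```python
import math


def compare_states(reference, candidate, rel_tol=1e-6, abs_tol=0.0):
    """Return a list of mismatch descriptions; an empty list means the outputs agree.

    `reference` and `candidate` map a variable name to a flat sequence of floats
    (assumed layout, for illustration only). NaN values count as mismatches
    because math.isclose never treats NaN as close to anything.
    """
    problems = []
    for name in sorted(set(reference) | set(candidate)):
        if name not in reference or name not in candidate:
            problems.append(f"{name}: present in only one output")
            continue
        ref, cand = reference[name], candidate[name]
        if len(ref) != len(cand):
            problems.append(f"{name}: length {len(ref)} vs {len(cand)}")
            continue
        pairs = list(zip(ref, cand))
        if not all(math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol) for a, b in pairs):
            worst = max(abs(a - b) for a, b in pairs)
            problems.append(f"{name}: max abs difference {worst:g}")
    return problems
```

Running this once per cycle on the wgrib2-based and pygrib-based outputs would give a short report of which variables, if any, differ.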

Sample: https://github.com/NOAA-EMC/ML4BC/blob/main/gen_training_0.25d.py

`gdas_utility.py` that needs to be updated: https://github.com/NOAA-EMC/graphcast/blob/main/NCEP/gdas_utility.py
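Independently of the linked sample (which is not reproduced here), a minimal sketch of pygrib-based extraction might look like the following. The function names, the `isobaricInhPa` level type, and the output layout are assumptions for illustration; only `pygrib.open`, `select`, `close`, and the message attributes `shortName`, `level`, and `values` come from pygrib itself.

```python
from collections import defaultdict


def group_by_variable(messages):
    """Group GRIB messages into {shortName: {level: values}}, levels sorted ascending."""
    grouped = defaultdict(dict)
    for msg in messages:
        grouped[msg.shortName][msg.level] = msg.values
    return {name: dict(sorted(levels.items())) for name, levels in grouped.items()}


def read_pressure_levels(path, short_names, levels):
    """Read the requested pressure-level variables from one GRIB2 file.

    Hypothetical helper: `short_names` and `levels` are lists such as
    ["t", "u"] and [500, 850]. select() raises ValueError if nothing matches.
    """
    import pygrib  # imported lazily so group_by_variable works without pygrib

    grbs = pygrib.open(path)
    try:
        selected = grbs.select(
            shortName=short_names, typeOfLevel="isobaricInhPa", level=levels
        )
        # Read the values while the file is still open.
        return group_by_variable(selected)
    finally:
        grbs.close()
```

Because all matching messages come back from a single `select` call, there is no per-variable subprocess or intermediate file to merge afterwards, which is where the expected saving over the wgrib2 route would come from.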

SadeghTabas-NOAA commented 5 months ago

@LinlinCui-NOAA Please also check which option is faster, pygrib or wgrib2. If wgrib2 is faster, you may update the script to support both and let the user choose which one to use (some machines may not have wgrib2 installed). If pygrib is faster, just replace wgrib2 with pygrib.
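If both options end up being supported, the choice could be handled roughly as below. This is a sketch under assumptions: `available_backends` and `choose_backend` are hypothetical names, and the default preference for pygrib is arbitrary until the timing comparison is done.

```python
import importlib.util
import shutil


def available_backends():
    """List the extraction backends usable on this machine."""
    found = []
    if importlib.util.find_spec("pygrib") is not None:  # Python package importable
        found.append("pygrib")
    if shutil.which("wgrib2") is not None:  # executable found on PATH
        found.append("wgrib2")
    return found


def choose_backend(preferred=None, available=None):
    """Return the user's preferred backend, or the first available one."""
    if available is None:
        available = available_backends()
    if preferred is not None:
        if preferred not in available:
            raise RuntimeError(f"{preferred} is not available on this machine")
        return preferred
    if not available:
        raise RuntimeError("neither pygrib nor wgrib2 is available")
    return available[0]
```

A command-line option on the script could pass the user's choice through as `preferred`, leaving it as `None` to fall back to whatever is installed.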

To check which one is faster, please run both scripts (the current version and the updated version) for 10 days (e.g., from 2022010100 to 2022011100) and compare the runtimes. Thanks.
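One way to set up that comparison is sketched below. The 6-hour cycle spacing and the inclusive end date are assumptions, and `process_cycle` stands in for whatever callable runs one version of the script for a single cycle.

```python
import time
from datetime import datetime, timedelta


def cycles(start, end, step_hours=6):
    """List cycle strings (YYYYMMDDHH) from start to end inclusive."""
    fmt = "%Y%m%d%H"
    current = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    out = []
    while current <= last:
        out.append(current.strftime(fmt))
        current += timedelta(hours=step_hours)
    return out


def time_backend(process_cycle, cycle_list):
    """Return the wall-clock seconds spent calling process_cycle on every cycle."""
    started = time.perf_counter()
    for cycle in cycle_list:
        process_cycle(cycle)
    return time.perf_counter() - started
```

Timing both versions over the same list from `cycles("2022010100", "2022011100")`, ideally more than once to smooth out file-system caching, should give a fair comparison.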