Open rsignell-usgs opened 3 years ago
Hi @rsignell-usgs and @abarciauskas-bgse, I opened issue #17 for the GEFSv12 data, which are also on AWS in grib2 format. Best!
Awesome thanks @chiaral
I just saw this tweet: https://twitter.com/tayloragowan/status/1360032380560441348
Did you know that HRRR model output is available in @zarr_dev format? I developed the dataset for part of my PhD w/ help from the AWS Sustainability Initiative! You can find more info here https://registry.opendata.aws/noaa-hrrr-pds/ or watch my defense on Tuesday at 1pm MST! DM for Zoom link.
@rsignell-usgs @rabernat should we share Rich's approach to a Zarr data store? It's a different approach from the one hosted on AWS and, as I understand it, more useful for Rich's use case.
https://github.com/blaylockbk/HRRR_archive_download/issues/2#issuecomment-763713889
I believe that this recipe (HRRR to Zarr) should work today with copy_inputs_to_local_file=True, plus xarray_open_kwargs={'engine': 'cfgrib'}. Someone should try it.
Source Dataset
The High-Resolution Rapid Refresh (HRRR) forecast model is the highest-resolution (3 km) meteorological model from NOAA that covers the entire US. The forecast archive from 2014 to the present is available as part of the NOAA Big Data Program on AWS. We want the data for forecast hour 01.
link: https://noaa-hrrr-bdp-pds.s3.amazonaws.com/index.html
format: grib2
access: AWS s3
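To make the "forecast hour 01" request concrete, here is a minimal sketch of how the input URLs might be enumerated. It assumes hourly model cycles and that every date in the archive follows the same path layout as the single wrfnatf01 example URL in this issue; older parts of the archive may differ. The function names are made up for illustration.

```python
from datetime import datetime, timedelta

BASE_URL = "https://noaa-hrrr-bdp-pds.s3.amazonaws.com"


def hrrr_f01_url(cycle: datetime) -> str:
    """URL of the forecast-hour-01 native-level file for one model cycle.

    Assumption: the path layout matches the example URL in this issue.
    """
    return (
        f"{BASE_URL}/hrrr.{cycle:%Y%m%d}/conus/"
        f"hrrr.t{cycle:%H}z.wrfnatf01.grib2"
    )


def hrrr_f01_urls(start: datetime, end: datetime) -> list:
    """URLs for every hourly cycle in the half-open interval [start, end)."""
    urls = []
    cycle = start
    while cycle < end:
        urls.append(hrrr_f01_url(cycle))
        cycle += timedelta(hours=1)  # assumption: one cycle per hour
    return urls


if __name__ == "__main__":
    # Print the first three cycles of 2019-01-01 as a quick check.
    for url in hrrr_f01_urls(datetime(2019, 1, 1, 0), datetime(2019, 1, 1, 3)):
        print(url)
```

A list like this could feed whatever file pattern or download loop ends up being used.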
Transformation / Alignment / Merging
It would be great to form a best time series using the data from forecast hour T01.
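As a rough illustration of what the time axis of such a series would look like: each T01 file is valid one hour after its model cycle, so the series is indexed by cycle time plus one hour. The sketch below assumes hourly cycles and uses hypothetical helper names.

```python
from datetime import datetime, timedelta

FORECAST_HOUR = 1  # the T01 forecast requested in this issue


def valid_time(cycle: datetime, forecast_hour: int = FORECAST_HOUR) -> datetime:
    """Time at which a forecast from the given cycle is valid."""
    return cycle + timedelta(hours=forecast_hour)


def best_time_series_times(cycles):
    """Sorted valid times for a set of cycles, rejecting duplicates."""
    times = sorted(valid_time(c) for c in cycles)
    if len(set(times)) != len(times):
        raise ValueError("two cycles map to the same valid time")
    return times


if __name__ == "__main__":
    cycles = [datetime(2019, 1, 1, h) for h in range(3)]
    for t in best_time_series_times(cycles):
        print(t.isoformat())
```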
It turns out that instead of reading the grib2 files with engine=cfgrib, it's faster to download the grib2 files, convert them to netcdf using wgrib2, and then load the resulting netcdf file into xarray.
wget https://noaa-hrrr-bdp-pds.s3.amazonaws.com/hrrr.20190101/conus/hrrr.t00z.wrfnatf01.grib2
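The convert-then-load route described above could be sketched roughly as follows. This assumes the wgrib2 executable is on the PATH and was built with NetCDF support, and that xarray is installed. The helper names are hypothetical, and how well the native-level variables survive the conversion has not been checked here.

```python
import subprocess
from pathlib import Path


def wgrib2_netcdf_command(grib_path, nc_path):
    """Build the wgrib2 command line that writes nc_path from grib_path."""
    return ["wgrib2", str(grib_path), "-netcdf", str(nc_path)]


def grib2_to_dataset(grib_path):
    """Convert one downloaded grib2 file to netcdf and open it with xarray."""
    import xarray as xr  # imported lazily so the command builder works without xarray

    # e.g. hrrr.t00z.wrfnatf01.grib2 -> hrrr.t00z.wrfnatf01.nc
    nc_path = Path(grib_path).with_suffix(".nc")
    subprocess.run(wgrib2_netcdf_command(grib_path, nc_path), check=True)
    return xr.open_dataset(nc_path)
```

After the wget above, `ds = grib2_to_dataset("hrrr.t00z.wrfnatf01.grib2")` would be the intended call.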
Output Dataset
Zarr output chunked thusly:
{'time':72, 'x':600, 'y':600}
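For a sense of scale, here is a small sketch of what that chunking implies. The 4-byte (float32) item size, the 1799 x 1059 CONUS grid shape, and the one-year (8760-hour) time axis are assumptions for illustration, not something stated in this issue.

```python
import math

CHUNKS = {"time": 72, "x": 600, "y": 600}


def chunk_nbytes(chunks, itemsize=4):
    """Uncompressed size of one chunk in bytes (itemsize=4 assumes float32)."""
    nbytes = itemsize
    for size in chunks.values():
        nbytes *= size
    return nbytes


def chunks_per_dim(shape, chunks):
    """Number of chunks along each dimension, counting partial edge chunks."""
    return {dim: math.ceil(shape[dim] / chunks[dim]) for dim in chunks}


if __name__ == "__main__":
    # Assumed array shape: one year of hourly data on a 1799 x 1059 grid.
    shape = {"time": 8760, "x": 1799, "y": 1059}
    print(f"{chunk_nbytes(CHUNKS) / 2**20:.1f} MiB per chunk (uncompressed)")
    print(chunks_per_dim(shape, CHUNKS))
```

Under those assumptions each chunk is roughly 99 MiB before compression, which favors reading long time series at a point or small region over reading full-domain maps at a single time.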