pangeo-forge / staged-recipes

A place to submit pangeo-forge recipes before they become fully fledged pangeo-forge feedstocks
https://pangeo-forge.readthedocs.io/en/latest/
Apache License 2.0

Example pipeline for HRRR #15

Open rsignell-usgs opened 3 years ago

rsignell-usgs commented 3 years ago

Source Dataset

The High-Resolution Rapid Refresh (HRRR) forecast model is the highest-resolution (3 km) meteorological model from NOAA that covers the entire US. The forecast archive from 2014 to the present is available on AWS as part of the NOAA Big Data Program. We want the data for forecast hour 01.

link: https://noaa-hrrr-bdp-pds.s3.amazonaws.com/index.html
format: grib2
access: AWS s3

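import fsspec

# Anonymous (unauthenticated) access to the public bucket
fs = fsspec.filesystem('s3', anon=True)
url = 'noaa-hrrr-bdp-pds'  # HRRR forecast archive (bucket name)

# List the forecast-hour-01 files from every model cycle on 2019-01-01
flist = fs.glob(url+'/hrrr.20190101/conus/hrrr.t*z.wrfnatf01.grib2')
flist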

Transformation / Alignment / Merging

It would be great to form a best time series using the data from forecast hour 01, i.e. by stitching together the f01 file from each model cycle.
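To make the idea concrete, here is a minimal sketch of how the list of input files for such a time series could be built. It assumes the URL pattern from the wget example in this issue holds for every cycle, that cycles are hourly, and that every file exists in the archive (none of which is checked here). The helper name `f01_urls` is hypothetical.

```python
from datetime import datetime, timedelta

# Base URL taken from the example link in this issue
BASE = "https://noaa-hrrr-bdp-pds.s3.amazonaws.com"


def f01_urls(start, n_cycles):
    """Return (valid_time, url) pairs for n_cycles consecutive hourly cycles.

    Hypothetical helper: assumes hourly cycles and a fixed URL pattern.
    """
    pairs = []
    for i in range(n_cycles):
        cycle = start + timedelta(hours=i)
        # Forecast hour 01 is valid one hour after the cycle (initialization) time
        valid = cycle + timedelta(hours=1)
        url = f"{BASE}/hrrr.{cycle:%Y%m%d}/conus/hrrr.t{cycle:%H}z.wrfnatf01.grib2"
        pairs.append((valid, url))
    return pairs


# One day of cycles, starting with the 00z run on 2019-01-01
pairs = f01_urls(datetime(2019, 1, 1, 0), 24)
```

Each pair carries the valid time, which is what the `time` dimension of the concatenated dataset would hold.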

It turns out that instead of reading the grib2 files with engine='cfgrib', it's faster to download the grib2 files, convert them to NetCDF using wgrib2, and then load the resulting NetCDF files into xarray.

wget https://noaa-hrrr-bdp-pds.s3.amazonaws.com/hrrr.20190101/conus/hrrr.t00z.wrfnatf01.grib2
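The conversion step could look roughly like the sketch below. It assumes wgrib2 is installed and on the PATH, and that its `-netcdf` output option can handle the fields in these files (not verified here). The helper names `wgrib2_netcdf_cmd` and `convert` are hypothetical.

```python
import subprocess
from pathlib import Path


def wgrib2_netcdf_cmd(grib_path, nc_path=None):
    """Build the wgrib2 command that writes a grib2 file out as NetCDF."""
    grib_path = Path(grib_path)
    # Default output name: same as the input, with the extension swapped to .nc
    nc_path = Path(nc_path) if nc_path else grib_path.with_suffix(".nc")
    return ["wgrib2", str(grib_path), "-netcdf", str(nc_path)]


def convert(grib_path, nc_path=None):
    """Run the conversion; raises CalledProcessError if wgrib2 fails."""
    cmd = wgrib2_netcdf_cmd(grib_path, nc_path)
    subprocess.run(cmd, check=True)
    return cmd[-1]  # path of the NetCDF file that was written
```

After calling `convert('hrrr.t00z.wrfnatf01.grib2')` on the downloaded file, the resulting `.nc` file can be opened with `xarray.open_dataset`.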

Output Dataset

Zarr output chunked thusly: {'time':72, 'x':600, 'y':600}
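For a rough sense of what that chunking implies, here is a back-of-the-envelope sketch. The grid size (1799 x 1059 points for the CONUS domain) and the float32 data type are assumptions, not something stated in this issue.

```python
import math

chunks = {"time": 72, "x": 600, "y": 600}  # from the proposal above
grid = {"x": 1799, "y": 1059}              # assumed HRRR CONUS grid size
itemsize = 4                               # assumed float32 (4 bytes per value)

# Uncompressed size of one full chunk of one variable on one level
chunk_bytes = itemsize * math.prod(chunks.values())

# Number of chunks along each spatial dimension (the last one is partial)
n_spatial = {dim: math.ceil(grid[dim] / chunks[dim]) for dim in grid}

print(f"{chunk_bytes / 2**20:.1f} MiB per chunk, spatial chunks: {n_spatial}")
```

Under those assumptions a full chunk is just under 100 MiB before compression, and the spatial grid splits into 3 x 2 chunks, with the trailing chunk along each dimension smaller than 600 points.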

chiaral commented 3 years ago

Hi @rsignell-usgs and @abarciauskas-bgse, I opened issue #17 for the GEFSv12 data, which are also on AWS in grib2 format. Best!

abarciauskas-bgse commented 3 years ago

Awesome, thanks @chiaral!

rabernat commented 3 years ago

I just saw this tweet: https://twitter.com/tayloragowan/status/1360032380560441348

Did you know that HRRR model output is available in @zarr_dev format? I developed the dataset for part of my PhD w/ help from the AWS Sustainability Initiative! You can find more info here https://registry.opendata.aws/noaa-hrrr-pds/ or watch my defense on Tuesday at 1pm MST! DM for Zoom link.

abarciauskas-bgse commented 3 years ago

@rsignell-usgs @rabernat should we share Rich's approach to a Zarr data store? It's a different approach from the one hosted on AWS and, as I understand it, more useful for Rich's use case.

https://github.com/blaylockbk/HRRR_archive_download/issues/2#issuecomment-763713889

rabernat commented 3 years ago

I believe that this recipe (HRRR to Zarr) should work today with copy_inputs_to_local_file=True, plus xarray_open_kwargs={'engine': 'cfgrib'}. Someone should try it.