ESIPFed / esiphub-dev

Development JupyterHub on AWS targeting pangeo environment for National Water Model exploration
MIT License

National Water Model data from S3 to HSDS #6

Open rsignell-usgs opened 6 years ago

rsignell-usgs commented 6 years ago

As part of the NOAA Big Data Project, the National Water Model data is now on AWS S3:

From Conor Delaney:

Just checked with CICs; the data is all there, I just didn't understand how it was structured. To get to a particular section of the datasets, use the Prefix parameter. The archive data is collated by year and the forecast data is collated by day: http://nwm-archive.s3.amazonaws.com/?prefix=2017 or http://noaa-nwm-pds.s3.amazonaws.com/?prefix=nwm.20180416
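The Prefix parameter described above can be sketched in a few lines of Python. This is only an illustration: the helper names `listing_url` and `keys_from_listing` are hypothetical, and the parser assumes the bucket answers with the standard S3 `ListBucketResult` XML document.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by S3 bucket-listing responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def listing_url(bucket, prefix):
    """Return the bucket-listing URL restricted to keys that start with `prefix`."""
    return "http://{}.s3.amazonaws.com/?{}".format(bucket, urlencode({"prefix": prefix}))


def keys_from_listing(xml_text):
    """Extract the object keys from a ListBucketResult XML document."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]


if __name__ == "__main__":
    # Archive data is grouped by year, forecast data by day.
    print(listing_url("nwm-archive", "2017"))
    print(listing_url("noaa-nwm-pds", "nwm.20180416"))
```

Fetching one of these URLs (for example with `urllib.request.urlopen`) and passing the response text to `keys_from_listing` should give the object keys under that prefix. A single listing response is capped at 1000 keys, so a full listing would need paging.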

@jreadey what is the best way to convert some of this data to HSDS?

Here's the list of files that I would like to create the HSDS dataset from (using the RENCI OPeNDAP server to illustrate the dataset):

import pandas as pd
import xarray as xr

root = 'http://tds.renci.org:8080/thredds/dodsC/nwm/forcing_short_range/'   # OPeNDAP
dates = pd.date_range(start='2018-04-01T00:00', end='2018-04-07T23:00', freq='H')
urls = ['{}{}/nwm.t{}z.short_range.forcing.f001.conus.nc'.format(root,a.strftime('%Y%m%d'),a.strftime('%H')) for a in dates]
print('\n'.join(urls))

# Recent xarray versions require combine='nested' when concat_dim is given
ds = xr.open_mfdataset(urls, combine='nested', concat_dim='time')
print(ds)

Could we just modify this somehow (perhaps using FUSE) to read the NetCDF files from S3?
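As a starting point for that question, here is a minimal sketch that builds the matching S3 object keys for the same date range. It assumes the keys under each `nwm.YYYYMMDD` day prefix mirror the THREDDS paths (`forcing_short_range/nwm.tHHz...`), which this thread does not confirm, and `forcing_keys` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

BUCKET = "noaa-nwm-pds"  # forecast bucket named in the quoted message


def forcing_keys(start, end, step=timedelta(hours=1)):
    """Build one short-range forcing key per model cycle from start to end, inclusive.

    ASSUMPTION: the layout below the day prefix mirrors the THREDDS path.
    """
    keys = []
    t = start
    while t <= end:
        keys.append(
            "nwm.{:%Y%m%d}/forcing_short_range/"
            "nwm.t{:%H}z.short_range.forcing.f001.conus.nc".format(t, t)
        )
        t += step
    return keys


keys = forcing_keys(datetime(2018, 4, 1, 0), datetime(2018, 4, 7, 23))
s3_urls = ["s3://{}/{}".format(BUCKET, key) for key in keys]
print("\n".join(s3_urls[:3]))
```

If the layout assumption holds, these keys could be read through a FUSE mount of the bucket or through an S3-aware file layer, then passed to `xr.open_mfdataset` the same way as the OPeNDAP URLs.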

rsignell-usgs commented 6 years ago

@ajelenak-thg, do you have a notebook demonstrating writing to HSDS from xarray?

Which makes me wonder: any plans for xr.to_hsds() functionality?

ghost commented 6 years ago

> @ajelenak-thg, do you have a notebook demonstrating writing to HSDS from xarray?

No, @rsignell-usgs, I don't have a notebook for that.

> Which makes me wonder: any plans for xr.to_hsds() functionality?

My thinking is that we would tackle this functionality through whatever xarray currently uses for writing out HDF5 files (ooops, I meant netCDF-4...): h5netcdf and h5pyd.

rsignell-usgs commented 6 years ago

@ajelenak-thg, oh yes, that makes sense.

jreadey commented 6 years ago

@ajelenak-thg - that's something we can try out right now, isn't it? (Using your PR on h5netcdf.)

ghost commented 6 years ago

I think so, and with the latest h5pyd. Haven't tried it myself yet.