Closed rajadain closed 2 years ago
This looks great! The only thing I would recommend is adding an assert at the end to ensure that both datasets are indeed the same. Maybe this?
dsz = xr.open_dataset(f'{PREDICTIONS_DATADIR}-channel_rt.zarr', engine='zarr')
assert all(np.allclose(ds[v].to_numpy(), dsz[v].to_numpy(), equal_nan=True)
           for v in ds.data_vars if ds[v].ndim > 0)  # skip scalar variables
Excellent!
On a separate note, I was wondering if we ought to wrap this whole thing in a function that gets the data for a particular day? Maybe something like get_geo_data from this example notebook? If so, I propose:
def get_short_range_forecast_data(date: str) -> xr.Dataset:
    ...
    return ds

# Called like
ds = get_short_range_forecast_data((datetime.datetime.now() - datetime.timedelta(days=1)).strftime('%Y%m%d'))
I have been unable to determine whether the predictions data are available on S3, so we might be unable to use a string glob with xr.open_mfdataset. Regardless, I believe that encapsulating your notebook steps into a function might be helpful. Whatever you decide is perfectly fine with me!
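One way to sidestep the glob question is to generate the object URLs explicitly and pass the list on to whatever opens the files. The following is a minimal sketch, not the notebook's code. It assumes the short-range channel routing files are laid out as `nwm.YYYYMMDD/short_range/nwm.tHHz.short_range.channel_rt.fFFF.conus.nc` with 18 hourly lead times per cycle; the helper name `short_range_channel_rt_urls` and its defaults are hypothetical and should be checked against the bucket.

```python
import datetime


def short_range_channel_rt_urls(date: str, cycle: int = 0, lead_hours: int = 18) -> list[str]:
    """Build S3 URLs for one short-range forecast cycle.

    Assumed key layout (verify against the bucket before relying on it):
    nwm.YYYYMMDD/short_range/nwm.tHHz.short_range.channel_rt.fFFF.conus.nc
    """
    # Fail early on malformed dates such as '2022-10-19'.
    datetime.datetime.strptime(date, "%Y%m%d")
    prefix = f"s3://noaa-nwm-pds/nwm.{date}/short_range"
    return [
        f"{prefix}/nwm.t{cycle:02d}z.short_range.channel_rt.f{hour:03d}.conus.nc"
        for hour in range(1, lead_hours + 1)
    ]


# Same calling convention as the proposed function: a YYYYMMDD string.
yesterday = (datetime.datetime.now() - datetime.timedelta(days=1)).strftime("%Y%m%d")
urls = short_range_channel_rt_urls(yesterday)
```

Keeping the URL construction separate from the reading step means the date handling can be checked without touching the network.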
The data is available on S3 as described here: https://docs.opendata.aws/noaa-nwm-pds/readme.html
aws s3 ls noaa-nwm-pds/nwm.20221019/
PRE analysis_assim/
PRE analysis_assim_extend/
PRE analysis_assim_extend_no_da/
PRE analysis_assim_hawaii/
PRE analysis_assim_hawaii_no_da/
PRE analysis_assim_long/
PRE analysis_assim_long_no_da/
PRE analysis_assim_no_da/
PRE analysis_assim_puertorico/
PRE analysis_assim_puertorico_no_da/
PRE forcing_analysis_assim/
PRE forcing_analysis_assim_extend/
PRE forcing_analysis_assim_hawaii/
PRE forcing_analysis_assim_puertorico/
PRE forcing_medium_range/
PRE forcing_short_range/
PRE forcing_short_range_hawaii/
PRE forcing_short_range_puertorico/
PRE long_range_mem1/
PRE long_range_mem2/
PRE long_range_mem3/
PRE long_range_mem4/
PRE medium_range_mem1/
PRE medium_range_mem2/
PRE medium_range_mem3/
PRE medium_range_mem4/
PRE medium_range_mem5/
PRE medium_range_mem6/
PRE medium_range_mem7/
PRE medium_range_no_da/
PRE short_range/
PRE short_range_hawaii/
PRE short_range_hawaii_no_da/
PRE short_range_puertorico/
PRE short_range_puertorico_no_da/
PRE usgs_timeslices/
Good idea on making it a function. I'll also try to read the source from S3 directly, as discussed here: https://gis.stackexchange.com/questions/429000/error-trying-to-open-netcdf-file-with-xarray-from-s3-bucket
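Reading a NetCDF object straight from the public bucket might look like the sketch below. It assumes `fsspec`, `s3fs`, `h5netcdf`, and `xarray` are installed and that the bucket allows anonymous access. `open_netcdf_from_s3` is a hypothetical helper name, not something from the notebook.

```python
def open_netcdf_from_s3(url: str):
    """Open one NetCDF object from a public S3 bucket as an xarray Dataset."""
    # Imported here so the module can be loaded without the optional dependencies.
    import fsspec
    import xarray as xr

    # anon=True is forwarded to s3fs and skips credential lookup.
    with fsspec.open(url, mode="rb", anon=True) as f:
        # .load() pulls the data into memory before the file handle closes.
        return xr.open_dataset(f, engine="h5netcdf").load()
```

It would be called with a full `s3://noaa-nwm-pds/...` URL for a single file. Loading eagerly keeps the example simple, but for large files a lazy read with the handle kept open may be preferable.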
https://github.com/awslabs/open-data-docs/tree/main/docs seems like a good page to bookmark!
Added timings for reading from and writing to S3, plus a comparison of reading NetCDF from S3 vs. reading Zarr, in 2ade85f.
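For anyone wanting to reproduce this kind of comparison outside a notebook cell, a small timing helper is one option. This is only a sketch and not necessarily how the commit measures things; the name `timed` and the placeholder workload are illustrative assumptions.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("placeholder work", timings):
    sum(range(100_000))  # stand-in for e.g. reading NetCDF or Zarr

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f} s")
```

Collecting the durations in a dict makes it easy to print the NetCDF and Zarr numbers side by side afterwards.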
Thanks for reviewing!
Overview
Adds a Notebook that pulls down NWM Predictions Short Range Channel Routing data and converts it to Zarr.
This is a simple conversion of one snapshot. The next step is to do this for multiple snapshots and append the data to the same Zarr file.
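The append step described above could be sketched roughly as follows. It assumes each snapshot is an `xarray.Dataset` that has a `time` dimension and that the Zarr store is a local directory; `append_snapshot` is a hypothetical helper, and a remote store would need a different existence check.

```python
import os


def append_snapshot(ds, store: str, dim: str = "time") -> None:
    """Write `ds` to `store`, appending along `dim` if the store already exists."""
    if os.path.exists(store):
        # Existing store: extend it along the chosen dimension.
        ds.to_zarr(store, append_dim=dim)
    else:
        # First snapshot: create the store from scratch.
        ds.to_zarr(store, mode="w")
```

Calling it once per snapshot, in time order, should grow a single store. Variables without the append dimension are not extended, so they are expected to be identical across snapshots.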
Checklist
Ran nbautoexport export . in /opt/src/notebooks and committed the generated scripts. This is to make reviewing notebooks easier. (Note the export will happen automatically after saving notebooks from the Jupyter web app.)
Testing Instructions
Closes #102