microsoft / AIforEarthDataSets

Notebooks and documentation for AI-for-Earth-managed datasets on Azure
https://microsoft.github.io/AIforEarthDataSets/
MIT License

Add time-series query examples to noaa-nwm-example data notebook #40

Open jameshalgren opened 1 year ago

jameshalgren commented 1 year ago

The NWM notebook currently displays interaction examples with single files from the NOAA National Water Model output data.

Many uses of the NWM data require collecting a series of outputs and assembling them into a time series for either one or many points. This aggregation and cross-querying can be accomplished in a number of different ways: e.g., MultiZarr, Kerchunk, concatenated xarray datasets, direct NetCDF library access, etc.
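As a rough illustration of the "concatenated xarray datasets" option, here is a minimal sketch. It uses small synthetic in-memory datasets rather than real NWM files, and the variable name `streamflow`, the `feature_id` and `time` dimensions, and the helper `make_fake_output` are assumptions made for the example, not something taken from the notebook.

```python
import numpy as np
import xarray as xr


def make_fake_output(valid_time, feature_ids, seed):
    """Hypothetical stand-in for one output file: one time step, many points."""
    rng = np.random.default_rng(seed)
    return xr.Dataset(
        data_vars={
            # Assumed layout: a length-1 time dimension plus one value per point.
            "streamflow": (("time", "feature_id"), rng.random((1, len(feature_ids)))),
        },
        coords={"time": [valid_time], "feature_id": feature_ids},
    )


feature_ids = np.array([101, 102, 103])
# Four hourly valid times, stored at nanosecond precision.
times = np.arange(
    "2023-01-01T00", "2023-01-01T04", dtype="datetime64[h]"
).astype("datetime64[ns]")

# One dataset per "file", as if each had been opened separately.
per_file = [make_fake_output(t, feature_ids, seed=i) for i, t in enumerate(times)]

# Stack the single-time-step datasets along the time dimension.
combined = xr.concat(per_file, dim="time")

# Time series for one point, and for several points at once.
one_point = combined["streamflow"].sel(feature_id=102)
many_points = combined["streamflow"].sel(feature_id=[101, 103])

print(one_point.dims, many_points.shape)
```

With real files, the list of datasets would presumably come from opening each output lazily; the concatenation and selection steps would look much the same.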

It would be powerful to include several examples in this notebook, along with performance statistics showing their relative advantages, especially any advantage specific to using the data within the Azure platform.
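For the performance statistics, something along these lines might be enough to start with. This is only a sketch of a timing harness: `time_approach`, `by_loop`, and `by_comprehension` are hypothetical names, and the two functions are trivial placeholders standing in for the real aggregation approaches.

```python
import statistics
import time


def time_approach(func, repeats=5):
    """Run func() several times; return (last result, median seconds)."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func()
        durations.append(time.perf_counter() - start)
    return result, statistics.median(durations)


# Placeholder data: one record per time step for a single point.
records = [{"time": t, "flow": float(t) * 0.5} for t in range(1000)]


def by_loop():
    """Placeholder approach 1: build the series with an explicit loop."""
    series = []
    for rec in records:
        series.append(rec["flow"])
    return series


def by_comprehension():
    """Placeholder approach 2: build the series with a list comprehension."""
    return [rec["flow"] for rec in records]


timings = {}
results = {}
for name, func in {"loop": by_loop, "comprehension": by_comprehension}.items():
    results[name], timings[name] = time_approach(func)

# The comparison is only fair if every approach returns the same series.
assert results["loop"] == results["comprehension"]

# Report the approaches from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds * 1e3:.3f} ms (median of 5 runs)")
```

Using the median of several runs keeps a single slow run (for example, a cold cache on first access) from dominating the comparison.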

(I'll work on some prototypes and see if I can issue a PR...)

jameshalgren commented 1 year ago

Some possible examples here.

TomAugspurger commented 1 year ago

Thanks James! I'll try to take a look at this sometime in the next week or two. I agree it'd be great to have some examples doing this.

At some point, we'd ideally build and host those cloud-optimized indices for users. But it'll take us a while to get there.