This turned out to be a bit bigger than I had intended, but it is the result of months of experimentation as well as feedback from frequent users of National Water Model output. I'll summarize changes for each component.
FileDownloader.py
Added an overwrite boolean parameter that will warn and skip similarly named files.
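For reviewers, here is a minimal sketch of the warn-and-skip behavior described above. It is not the FileDownloader implementation: `save_file` and `demo` are hypothetical names, and it assumes that `overwrite=False` is the setting that triggers the warning and skip for a file that already exists under the same name.

```python
import tempfile
import warnings
from pathlib import Path


def save_file(path: Path, data: bytes, overwrite: bool = False) -> bool:
    """Hypothetical helper: write data to path.

    Returns True if the file was written, False if it was skipped.
    Assumption: overwrite=False means "warn and skip existing files".
    """
    if path.exists() and not overwrite:
        warnings.warn(f"{path} already exists, skipping", UserWarning)
        return False
    path.write_bytes(data)
    return True


def demo() -> tuple:
    """Write the same file three times and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "example.nc"
        first = save_file(target, b"first")
        # Second write hits an existing file, so it should warn and skip.
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            second = save_file(target, b"second")
        kept = target.read_bytes()
        # Third write opts in to replacing the existing file.
        third = save_file(target, b"third", overwrite=True)
        replaced = target.read_bytes()
    return first, second, len(caught), kept, third, replaced
```

Returning a boolean is a choice made for this sketch only, so the caller can tell a skip from a write.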
NWMFileProcessor.py
Updated get_dataset to accept the xarray.open_mfdataset paths argument. Essential coordinates (reference_time, time, and feature_id) are always returned.
NWMClient.py
Updated abstract interface to accept an nwm_feature_ids parameter to facilitate retrieval of smaller amounts of data. Unfortunately, retrieving data is still relatively slow, but this new interface accommodates the most common use-case (retrieval of time series data from a few sites).
NWMFileClient.py
Added a unit_system parameter to match the older nwm_client unit handling. Replaced the get_dataset method with get_files, which only downloads files and returns their local paths. The get method works with several configurations and reference_times by default. Implements the nwm_feature_ids parameter.
ParquetStore.py
Formerly ParquetCache.py. I renamed this module and reimplemented it as a MutableMapping, similar to pandas.HDFStore. It can still be used as a context manager, but it now also supports key access and inherits useful methods like get and pop from MutableMapping. As a consequence, the old get method was removed. This could live somewhere else, eventually.
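To illustrate the design, here is a rough sketch of how subclassing `collections.abc.MutableMapping` provides `get` and `pop` once five core methods are defined. `SketchStore` is a hypothetical in-memory stand-in, not the ParquetStore code: it keeps values in a plain dict and does not read or write Parquet files.

```python
from collections.abc import MutableMapping


class SketchStore(MutableMapping):
    """Hypothetical dict-backed store that is also a context manager."""

    def __init__(self):
        self._data = {}
        self.closed = False

    # The five abstract methods MutableMapping requires.
    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    # Context manager support, so `with SketchStore() as store:` works.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.closed = True
        return False  # do not suppress exceptions
```

With those methods in place, `store["key"] = value`, `store.get("key")`, `store.pop("key")`, and `"key" in store` all work without further code, which is why a hand-written get method becomes redundant.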
NWMClientDefaults.py
Added defaults for handling units, variables, and the new ParquetStore.
UnitHandler.py
Taken verbatim from the older nwm_client. This could potentially live somewhere else (eventually).
_version.py
Bumped to 7.0.0 due to breaking interface changes.
Checklist
[x] PR has an informative and human-readable title
[x] PR is well outlined and documented. See #12 for an example
[?] Changes are limited to a single goal (no scope creep)
[x] Code can be automatically merged (no conflicts)