Deltares / ddlpy

API to Dutch Rijkswaterstaat archive (DDL, waterinfo.rws.nl) of monitoring water data
https://deltares.github.io/ddlpy/
GNU General Public License v3.0

Add yearly frequency to `ddlpy.measurements()` #94

Closed (veenstrajelmer closed 4 months ago)

veenstrajelmer commented 4 months ago

Description

Retrieving data from the DDL with ddlpy is quite slow because of the hardcoded monthly frequency. Each request takes quite some time, even if no data is returned, so a yearly frequency would be much more efficient. However, it is not always possible to retrieve an entire year of 10-minute values when there are many duplicated timesteps. The DDL returns at most 157681 values per request, and this number is sometimes exceeded for a subset of stations, as documented in https://github.com/Rijkswaterstaat/wm-ws-dl/issues/39. The linked issue focusses on 10-minute WATHTE data only, but there might be other timeseries with higher frequencies or more duplicates that also exceed this number. Even when the number is not exceeded, the DDL sometimes raises timeout errors. It was therefore a wise choice to make the monthly frequency the default. However, for water level extremes (about four values per day), a yearly frequency will not cause issues and will improve performance significantly, since the request overhead is reduced by a factor of 12. A yearly frequency is also fine for most 10-minute timeseries, but this would require a try-except, so it should not be the default.
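To make the try-except idea concrete, here is a minimal sketch of requesting yearly chunks and falling back to monthly chunks only for years that fail. It is not ddlpy's implementation: `split_period`, `fetch_with_fallback`, the `fetch` callable and the `recheck_empty` flag are all hypothetical names, and `fetch(start, end)` stands in for whatever performs a single DDL request and returns a list of values.

```python
import datetime as dt
from typing import Callable, Iterator, List, Tuple

Period = Tuple[dt.datetime, dt.datetime]


def split_period(start: dt.datetime, end: dt.datetime, yearly: bool) -> Iterator[Period]:
    """Yield consecutive (chunk_start, chunk_end) pairs covering start..end.

    Chunks break on calendar-year boundaries if `yearly`, else on month boundaries.
    """
    cur = start
    while cur < end:
        if yearly or cur.month == 12:
            nxt = dt.datetime(cur.year + 1, 1, 1)
        else:
            nxt = dt.datetime(cur.year, cur.month + 1, 1)
        nxt = min(nxt, end)
        yield cur, nxt
        cur = nxt


def fetch_with_fallback(
    fetch: Callable[[dt.datetime, dt.datetime], List],  # hypothetical single-request function
    start: dt.datetime,
    end: dt.datetime,
    recheck_empty: bool = False,
) -> List:
    """Request one year at a time; retry a year month by month if it fails.

    With `recheck_empty=True`, an empty yearly result is also retried monthly.
    This costs extra requests for years that are genuinely empty.
    """
    results: List = []
    for y_start, y_end in split_period(start, end, yearly=True):
        try:
            chunk = fetch(y_start, y_end)
            retry_monthly = recheck_empty and len(chunk) == 0
        except Exception:
            # Assumption: any error (too many values, timeout) warrants smaller chunks.
            retry_monthly = True
        if retry_monthly:
            chunk = []
            for m_start, m_end in split_period(y_start, y_end, yearly=False):
                chunk.extend(fetch(m_start, m_end))
        results.extend(chunk)
    return results
```

With this layout a year that succeeds costs one request instead of twelve, while a year that fails costs thirteen, so the approach only pays off when failures are rare. For reference, a year of unique 10-minute values is 365 × 144 = 52560 values, so the 157681 limit is only reached when a year contains roughly three values per timestep on average.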

Suggestion

Note

Use this feature with caution. When too large a dataset is requested at once, the DDL sometimes returns an empty response instead of a proper error message: https://github.com/Rijkswaterstaat/wm-ws-dl/issues/40