When retrieving data that spans multiple months, the results are not sorted by time. This also happens in ddl/waterwebservices, so it is not a ddlpy problem. However, it would be convenient to sort the returned dataframe by time anyway.
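As a minimal sketch of the workaround, assuming the returned dataframe has a time column named `t` (as in the scripts in this issue), the sorting can be done with pandas after retrieval. The small dataframe here is synthetic and only mimics chunks arriving out of order; it is not actual ddlpy output.

```python
import pandas as pd

# Synthetic stand-in for a retrieved dataframe: two chunks, the later one first.
# (Made-up values for illustration, not actual ddlpy output.)
chunk_nov = pd.DataFrame({
    "t": pd.date_range("2019-11-01", periods=3, freq="D"),
    "Meetwaarde.Waarde_Numeriek": [10.0, 12.0, 11.0],
})
chunk_oct = pd.DataFrame({
    "t": pd.date_range("2019-10-29", periods=3, freq="D"),
    "Meetwaarde.Waarde_Numeriek": [5.0, 7.0, 6.0],
})
meas = pd.concat([chunk_nov, chunk_oct], ignore_index=True)

# Sort on the time column; mergesort is stable, so rows with equal
# timestamps keep their original relative order.
meas_sorted = meas.sort_values("t", kind="mergesort").reset_index(drop=True)
```

Resetting the index is optional; it is done here so that positional and label-based indexing agree after sorting.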
What I Did
import ddlpy # TODO: we require ddlpy from main/master branch (>0.1.0) >> pip install git+https://github.com/openearth/ddlpy
import datetime as dt
import matplotlib.pyplot as plt
plt.close("all")
# input parameters
start_date = dt.datetime(2019,10,24)
end_date = dt.datetime(2019,12,5)
locations = ddlpy.locations()
bool_hoedanigheid = locations['Hoedanigheid.Code'].isin(['NAP'])
bool_stations = locations.index.isin(['HOEKVHLD'])
bool_grootheid = locations['Grootheid.Code'].isin(['WATHTE'])
locs_wathte = locations.loc[bool_grootheid & bool_hoedanigheid & bool_stations]
# retrieve with ddlpy
meas_wathte = ddlpy.measurements(locs_wathte.iloc[0], start_date=start_date, end_date=end_date)
# filter measured waterlevels (drop waterlevel extremes)
meas_wathte_ts = meas_wathte.loc[meas_wathte['Groepering.code'].isin(['NVT'])]
# sort on time values # TODO: do this in ddlpy or in ddl
# meas_wathte_ts = meas_wathte_ts.sort_values("t")
fig, ax = plt.subplots()
ax.plot(meas_wathte_ts['t'], meas_wathte_ts['Meetwaarde.Waarde_Numeriek'])
fig.tight_layout()
Gives:
When directly calling waterwebservices we also get this, albeit with different cuts (probably due to the ddlpy month subsetting):
import ddlpy # TODO: we require ddlpy from main/master branch (>0.1.0) >> pip install git+https://github.com/openearth/ddlpy
import datetime as dt
import matplotlib.pyplot as plt
plt.close("all")
import requests
import pandas as pd
# input parameters
start_date = dt.datetime(2019,10,24)
end_date = dt.datetime(2019,12,5)
locations = ddlpy.locations()
bool_hoedanigheid = locations['Hoedanigheid.Code'].isin(['NAP'])
bool_stations = locations.index.isin(['HOEKVHLD'])
bool_grootheid = locations['Grootheid.Code'].isin(['WATHTE'])
locs_wathte = locations.loc[bool_grootheid & bool_hoedanigheid & bool_stations]
# direct retrieve
url_ddl = 'https://waterwebservices.rijkswaterstaat.nl/ONLINEWAARNEMINGENSERVICES_DBO/OphalenWaarnemingen'
request_ddl = {"Locatie":{"Code":locs_wathte.iloc[0].name, "X":locs_wathte.iloc[0]["X"], "Y":locs_wathte.iloc[0]["Y"]},
               "AquoPlusWaarnemingMetadata":{
                   "AquoMetadata":{"Grootheid":{"Code":"WATHTE"},
                                   "Hoedanigheid":{"Code":"NAP"},
                                   "Groepering":{"Code":"NVT"}}},
               "Periode":{
                   "Begindatumtijd":"2019-10-24T00:00:00.000+01:00",
                   "Einddatumtijd":"2019-12-05T00:00:00.000+01:00"}}
resp = requests.post(url_ddl, json=request_ddl)
if not resp.ok:
    raise Exception('%s for %s: %s'%(resp.reason, resp.url, str(resp.text)))
result = resp.json()
if not result['Succesvol']:
    raise Exception('query not successful, Foutmelding: %s from %s'%(result['Foutmelding'],url_ddl))
for one in result['WaarnemingenLijst']:
    # print(one['AquoMetadata']['Grootheid'])
    # print(one['AquoMetadata']['Hoedanigheid'])
    # print(one['AquoMetadata']['Groepering'])
    data_ddl = pd.json_normalize(one['MetingenLijst'])
    data_ddl["t"] = pd.DatetimeIndex(data_ddl['Tijdstip'])
# sort on time values # TODO: do this in ddlpy or in ddl
# data_ddl = data_ddl.sort_values("t")
fig, ax = plt.subplots()
ax.plot(data_ddl["t"], data_ddl['Meetwaarde.Waarde_Numeriek'])
ax.set_title("data from waterwebservices")
fig.tight_layout()
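To see where the ordering breaks without relying on the plot, one option is to look for rows whose timestamp is earlier than the previous row's. The sketch below assumes a time column named `t`; the helper `find_time_jumps` and the sample data are hypothetical and not part of ddlpy or waterwebservices.

```python
import pandas as pd

def find_time_jumps(df, time_col="t"):
    """Return the rows whose timestamp is earlier than that of the preceding row."""
    # diff() yields NaT for the first row; NaT < 0 evaluates to False.
    backwards = df[time_col].diff() < pd.Timedelta(0)
    return df.loc[backwards]

# Synthetic example (made-up values): the third row jumps back in time.
data = pd.DataFrame({
    "t": pd.to_datetime(["2019-11-01", "2019-11-02", "2019-10-24", "2019-10-25"]),
    "value": [1.0, 2.0, 3.0, 4.0],
})
jumps = find_time_jumps(data)
is_sorted = data["t"].is_monotonic_increasing
```

Each row in `jumps` marks the start of a block that is out of chronological order, which may help when comparing the cuts seen via ddlpy with those from the direct request.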