When retrieving data for kenmerkendewaarden, some extremes resulted in warnings about an unknown time format:
INFO:kenmerkendewaarden.data_retrieve:retrieving meas data (extremes=True) from DDL for DENOVBTN to measurements_wl_18700101_20240101
39%|███▉ | 60/154 [01:22<02:18, 1.48s/it]C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\ddlpy\ddlpy.py:286: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df["time"] = pd.to_datetime(df["Tijdstip"])
91%|█████████ | 140/154 [03:19<00:20, 1.45s/it]C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\ddlpy\ddlpy.py:286: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df["time"] = pd.to_datetime(df["Tijdstip"])
100%|██████████| 154/154 [03:39<00:00, 1.43s/it]
39%|███▉ | 60/154 [01:21<02:18, 1.47s/it]C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\ddlpy\ddlpy.py:286: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df["time"] = pd.to_datetime(df["Tijdstip"])
91%|█████████ | 140/154 [03:20<00:20, 1.46s/it]C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\ddlpy\ddlpy.py:286: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df["time"] = pd.to_datetime(df["Tijdstip"])
100%|██████████| 154/154 [03:40<00:00, 1.43s/it]
INFO:kenmerkendewaarden.data_retrieve:retrieving meas data (extremes=True) from DDL for STAVNSE to measurements_wl_18700101_20240101
96%|█████████▌| 148/154 [03:13<00:08, 1.45s/it]C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\ddlpy\ddlpy.py:286: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df["time"] = pd.to_datetime(df["Tijdstip"])
100%|██████████| 154/154 [03:21<00:00, 1.31s/it]
96%|█████████▌| 148/154 [03:16<00:09, 1.63s/it]C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\ddlpy\ddlpy.py:286: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df["time"] = pd.to_datetime(df["Tijdstip"])
100%|██████████| 154/154 [03:25<00:00, 1.34s/it]
INFO:kenmerkendewaarden.data_retrieve:retrieving meas data (extremes=True) from DDL for TERNZN to measurements_wl_18700101_20240101
85%|████████▌ | 131/154 [03:09<00:33, 1.44s/it]C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\ddlpy\ddlpy.py:286: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df["time"] = pd.to_datetime(df["Tijdstip"])
100%|██████████| 154/154 [03:42<00:00, 1.44s/it]
85%|████████▌ | 131/154 [03:13<00:34, 1.49s/it]C:\Users\veenstra\Anaconda3\envs\dfm_tools_env\Lib\site-packages\ddlpy\ddlpy.py:286: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
df["time"] = pd.to_datetime(df["Tijdstip"])
100%|██████████| 154/154 [03:46<00:00, 1.47s/it]
What I Did
A reproducible example for these stations/years produced the warnings at first, but on later runs they no longer appeared.
from kenmerkendewaarden.data_retrieve import retrieve_catalog
import pandas as pd
import dateutil.rrule  # import the submodule explicitly, it is used for the request frequency
import ddlpy
import logging

logging.basicConfig()  # calling basicConfig is essential to set logging level for sub-modules
logging.getLogger("kenmerkendewaarden").setLevel(level="INFO")
logger = logging.getLogger(__name__)

# progress-bar indices (years since 1870) at which the warning appeared, per station
idx_warning_dict = {
    "DENOVBTN": [60, 140],
    "STAVNSE": [148],
    "TERNZN": [131],
}

_, locs_meas_ext, locs_meas_exttype = retrieve_catalog()

for station in idx_warning_dict.keys():
    idx_warning_list = idx_warning_dict[station]
    for idx_warning in idx_warning_list:
        # only retrieve the first month of the year in which the warning appeared
        year_one = 1870 + idx_warning
        start_date = pd.Timestamp(year_one, 1, 1)
        end_date = pd.Timestamp(year_one, 2, 1)
        logger.warning(f"retrieving extremes from DDL for {station} for year {year_one}")

        bool_station_ext = locs_meas_ext.index.isin([station])
        bool_station_exttype = locs_meas_exttype.index.isin([station])
        loc_meas_ext_one = locs_meas_ext.loc[bool_station_ext]
        loc_meas_exttype_one = locs_meas_exttype.loc[bool_station_exttype]

        freq = dateutil.rrule.YEARLY
        measurements = ddlpy.measurements(
            location=loc_meas_ext_one.iloc[0],
            start_date=start_date,
            end_date=end_date,
            freq=freq,
        )
        # convert extreme type to HWLWcode; add extreme type and HWLWcode as dataset variables
        # TODO: simplify by retrieving the extreme value and type from ddl in a single request: https://github.com/Rijkswaterstaat/wm-ws-dl/issues/19
        measurements_exttyp = ddlpy.measurements(
            location=loc_meas_exttype_one.iloc[0],
            start_date=start_date,
            end_date=end_date,
            freq=freq,
        )
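To check per request whether the warning still occurs, the calls in the script above could be wrapped so that warnings are recorded instead of printed. This is only a sketch using the standard-library `warnings` module: `call_and_collect_warnings` and `_demo_parse` are hypothetical helper names, and `_demo_parse` is a stand-in for a real `ddlpy.measurements()` call.

```python
import warnings


def call_and_collect_warnings(func, *args, **kwargs):
    """Call func and return (result, list of UserWarning messages raised during the call)."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" records a warning even if the same one was emitted earlier
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    messages = [str(w.message) for w in caught if issubclass(w.category, UserWarning)]
    return result, messages


def _demo_parse(values):
    # hypothetical stand-in for a ddlpy.measurements() call that emits the warning
    warnings.warn("Could not infer format", UserWarning)
    return list(values)


result, messages = call_and_collect_warnings(_demo_parse, ["a", "b"])
```

In the reproduction loop, `ddlpy.measurements` and its keyword arguments could be passed instead of the stand-in, and a non-empty `messages` list would mark the station/year combinations that still trigger the warning.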
Possible solution
It seemed that adding the correct format avoided the warnings:
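A minimal sketch of what specifying the format could look like. It assumes that the DDL `Tijdstip` strings are ISO-8601-like with milliseconds and a UTC offset (e.g. `2010-01-01T00:00:00.000+01:00`); both the sample values and the format string are assumptions and not necessarily what ddlpy should use.

```python
import pandas as pd

# Assumed sample of DDL-style timestamps; the real "Tijdstip" values may differ.
df = pd.DataFrame(
    {"Tijdstip": ["2010-01-01T00:00:00.000+01:00", "2010-01-01T06:10:00.000+01:00"]}
)

# Assumed format string matching the sample values above.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

# With an explicit format, pandas does not have to infer one from the data.
df["time"] = pd.to_datetime(df["Tijdstip"], format=TIME_FORMAT)
```

If a value does not match the given format, `pd.to_datetime` raises an error by default instead of silently falling back to `dateutil`, which makes unexpected timestamps visible.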
However, after checking this, the warnings could no longer be reproduced. Adding an explicit format still seems useful and safe.