Deltares / hatyan

Harmonic tidal analysis and prediction
https://deltares.github.io/hatyan/
GNU General Public License v3.0
13 stars 2 forks source link

consider using HWLWno in `hatyan.calc_HWLW12345to12()` #311

Closed veenstrajelmer closed 1 day ago

veenstrajelmer commented 3 months ago

The computation of extremes (1/2) from extremes including aggers (1/2/3/4/5) is quite complex and slow. There is an alternative method suggested that is simpler, faster and includes all necessary extremes.

The new method has some issues though:

Example code:

import os
import pandas as pd
import matplotlib.pyplot as plt
plt.close('all')
import hatyan
import kenmerkendewaarden as kw # pip install git+https://github.com/Deltares-research/kenmerkendewaarden

# set logging level to INFO to get log messages
import logging
logging.basicConfig() # calling basicConfig is essential to set logging level for sub-modules
logging.getLogger("kenmerkendewaarden").setLevel(level="INFO")

dir_base = r'p:\11210325-005-kenmerkende-waarden\work'
dir_meas = os.path.join(dir_base,'measurements_wl_20101201_20220201')
# dir_meas = os.path.join(dir_base,'measurements_wl_18700101_20240101')

current_station = 'HOEKVHLD'

print(f'loading data for {current_station}')
data_pd_meas_all = kw.read_measurements(dir_output=dir_meas, station=current_station, extremes=False)
data_pd_HWLW_all = kw.read_measurements(dir_output=dir_meas, station=current_station, extremes=True)

# original 12345 to 12 conversion method in hatyan
data_pd_HWLW_all_12 = hatyan.calc_HWLW12345to12(data_pd_HWLW_all) #convert 12345 to 12 by taking minimum of 345 as 2 (laagste laagwater)
data_pd_HWLW_all_12 = hatyan.calc_HWLWnumbering(data_pd_HWLW_all_12)

# alternative 12345 to 12 conversion (faster, better, but calc_HWLWnumbering is not robust)
data_pd_HWLW_all_nos = hatyan.calc_HWLWnumbering(data_pd_HWLW_all) # raises "Exception: tidal wave numbering: LW numbers not always increasing" for HOEKVHLD 1870-2024-timeseries
data_pd_HWLW_all_35 = data_pd_HWLW_all_nos[data_pd_HWLW_all_nos["HWLWcode"].isin([3,5])]
data_pd_HWLW_all_35_min = data_pd_HWLW_all_35.loc[data_pd_HWLW_all_35.groupby('HWLWno').values.idxmin()]
data_pd_HWLW_all_35_min["HWLWcode"] = 2
data_pd_HWLW_all_12_new = data_pd_HWLW_all_nos[data_pd_HWLW_all_nos["HWLWcode"].isin([1,2])]
data_pd_HWLW_all_12_new = pd.concat([data_pd_HWLW_all_35_min, data_pd_HWLW_all_12_new], axis=0).sort_index()

# plot clearly shows missing lowwaters in original attempt
fig, (ax1,ax2) = hatyan.plot_timeseries(ts=data_pd_meas_all, ts_ext=data_pd_HWLW_all_12_new, ts_ext_validation=data_pd_HWLW_all_12)
ax1.set_xlim(pd.Timestamp("2012-12-18"), pd.Timestamp("2012-12-23"))

Gives: image

veenstrajelmer commented 1 day ago

At this stage it is to tricky to let this function depend on the extreme numbering. In many datasets there are incorrect duplicates and this messes up the HWLWnumbering method, leading it to fail. The hatyan.calc_HWLW12345to12() is indeed quite slow, but it is robust at least.