TODO
Write detailed documentation of the outlier detection approach.
Add simulation tests as explanation.
detect_outliers(plot = TRUE): the function should explain outliers with plots. I think everyone agrees this is an important requirement.
Make a first version of detect_levelshifts() based on the discussion below.
Make a standardized plot for level shifts.
Check the "Nederlanders" (Dutch) approach; ask Mathias for the exact names.
We tried to pin down the essential aspects that define a "level shift" (LS).
CASE1
Mathias: the location is known: the Dijle valley. The highest peak is a flood. The other two are due to heavy rainfall before summer.
CASE2
Big change in both the mean and the first difference.
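A minimal sketch of the two quantities mentioned for this case, assuming the candidate shift position is already known. The helper name `shift_summary` and the example values are hypothetical; this is not the logic of detect_levelshifts(), only an illustration of what "mean" and "first difference" could refer to.

```python
from statistics import mean


def shift_summary(values, k):
    """Summarise a candidate level shift at index k (hypothetical helper).

    Returns a tuple with:
    - the change in mean between values[:k] and values[k:]
    - the first difference values[k] - values[k - 1]
    """
    if not 0 < k < len(values):
        raise ValueError("k must split the series into two non-empty parts")
    mean_change = mean(values[k:]) - mean(values[:k])
    first_diff = values[k] - values[k - 1]
    return mean_change, first_diff


# Made-up water levels with a step between index 3 and index 4.
levels = [10.0, 10.2, 9.9, 10.1, 12.0, 12.1, 11.9, 12.2]
mean_change, first_diff = shift_summary(levels, 4)
```

In a case like this one both numbers are large compared with the noise before the shift. A single flood peak would give a large first difference but hardly any change in mean.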
CASE3
Dubious due to missing data. Flag missing data?
CASE4
Too little data: not even one full year. Joris: do you want to detect shifts even if there is less than one year of data? Cf. the hydrological year: 1 April to 31 March.
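The "one full year" check could be sketched as follows, assuming the hydrological year convention mentioned above (1 April to 31 March). Both function names are hypothetical, and the check only looks at the first and last observation date; it does not account for gaps in between.

```python
from datetime import date


def hydrological_year(d):
    """Calendar year in which the hydrological year containing d starts.

    Assumes the convention from the notes: 1 April to 31 March.
    """
    return d.year if d.month >= 4 else d.year - 1


def has_full_hydrological_year(first, last):
    """True if the span [first, last] covers at least one complete
    hydrological year (1 April up to and including 31 March)."""
    # Year of the first 1 April on or after the start of the record.
    if first <= date(first.year, 4, 1):
        start_year = first.year
    else:
        start_year = first.year + 1
    # That hydrological year ends on 31 March of the following year.
    return last >= date(start_year + 1, 3, 31)
```

For example, a record from 1 May 2019 to 31 March 2020 spans eleven months but no complete hydrological year, so a rule of this kind would skip level shift detection for it.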
CASE5
Seasonality: yes, LS: no. Andy: maybe use the seasonal variation as a threshold in the rank approach? tsoutliers was correct here.
CASE6
Water withdrawal. This should be flagged as a level shift due to water withdrawal. Mathias: broken data logger... Frederic: should we distinguish between shifts that remain and shifts that return? This is no ordinary LS.
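Frederic's distinction could be sketched like this, assuming the shift position is known. The function name `classify_shift`, the `tail` window and the `tolerance` value are all hypothetical choices for illustration, not agreed parameters.

```python
from statistics import mean


def classify_shift(values, k, tail=3, tolerance=0.5):
    """Label a shift at index k as 'temporary' or 'persistent'.

    Hypothetical rule: compare the mean of the last `tail` values with
    the mean before the shift. If the series ends within `tolerance`
    of the old level, the shift is considered to have returned.
    """
    if not 0 < k <= len(values) - tail:
        raise ValueError("need data before k and at least `tail` points after it")
    baseline = mean(values[:k])
    end_level = mean(values[-tail:])
    if abs(end_level - baseline) <= tolerance:
        return "temporary"
    return "persistent"


# Made-up series: the level drops at index 3 and later recovers.
withdrawal = [5.0, 5.1, 4.9, 3.0, 3.1, 2.9, 4.9, 5.0, 5.1]
# Made-up series: the level drops at index 3 and stays low.
step = [5.0, 5.1, 4.9, 3.0, 3.1, 2.9, 3.0, 3.1, 2.9]
```

Here `classify_shift(withdrawal, 3)` gives "temporary" and `classify_shift(step, 3)` gives "persistent". A real rule would probably need a tolerance based on the variability of the series rather than a fixed number.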
CASE7
Clear random walk: no LS. Joris: if many points were missing (due to a lower measurement frequency), this could result in a large first-order difference.
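One possible way to deal with Joris's remark is to divide each first difference by the time gap between the two observations, so that a jump across a long gap is not mistaken for a sudden shift. This is a sketch under that assumption; `rate_of_change` is a hypothetical helper and the dates and values are made up.

```python
from datetime import date


def rate_of_change(dates, values):
    """First differences divided by the gap in days between observations."""
    if len(dates) != len(values):
        raise ValueError("dates and values must have the same length")
    rates = []
    for i in range(1, len(values)):
        gap_days = (dates[i] - dates[i - 1]).days
        if gap_days <= 0:
            raise ValueError("dates must be strictly increasing")
        rates.append((values[i] - values[i - 1]) / gap_days)
    return rates


# Daily measurements, followed by one measurement 20 days later.
dates = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3), date(2020, 1, 23)]
values = [10.0, 10.5, 11.0, 21.0]
raw_diffs = [b - a for a, b in zip(values, values[1:])]
rates = rate_of_change(dates, values)
```

The raw first differences end with a jump of 10.0, while the rate per day is the same for all three steps. For a pure random walk, scaling by the square root of the gap may be the more appropriate choice, since the spread of its increments grows with the square root of the elapsed time.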
CASE8
LS, but less than 1 year of data. Toon: a minimum of 2 years in total is best. This is problematic for Ilse: she has much less data...
CASE9
Seasonality doesn't apply here, so LS.
CASE10
No LS: clear random walk.
Notes: