JuliaDynamics / TransitionsInTimeseries.jl

Transition Indicators / Early Warning Signals / Regime Shifts / Change Point Detection
MIT License

Changed API for metrics requiring initialization and for significance testing #34

Closed JanJereczek closed 1 year ago

JanJereczek commented 1 year ago

Major changes:

Minor changes:

JanJereczek commented 1 year ago

Easy initialization of pre-computable metrics

As pointed out by George, it is more intuitive to provide the parameters of a metric (either an indicator or a change metric) together with the metric itself. In a previous version of this PR, this was handled by a separate struct, Params. The user can now simply run, for instance:

metrics = RidgeRegressionSlope(lambda=0.0)

This is now possible by introducing:

Additional metrics

The LowfreqPowerSpectrum is now implemented. It relies on the aforementioned pre-computation functionality. It was added to the test suite by checking that a low-frequency and a high-frequency signal return meaningful results.
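As a rough illustration of what such a low- vs. high-frequency check can look like, here is a dependency-free sketch. It is not the package's implementation: the names power_spectrum and lowfreq_power_fraction are made up for this example, the naive DFT stands in for a proper FFT, and the assumption is that the metric measures the share of spectral power in the lowest fraction q of frequency bins.

```julia
# Power spectrum via a naive O(n^2) discrete Fourier transform, to stay dependency-free.
function power_spectrum(x::AbstractVector{<:Real})
    n = length(x)
    nfreq = n ÷ 2 + 1                        # non-negative frequencies only
    ps = zeros(nfreq)
    for k in 0:(nfreq - 1)
        s = sum(x[j + 1] * cis(-2π * k * j / n) for j in 0:(n - 1))
        ps[k + 1] = abs2(s)
    end
    return ps
end

# Hypothetical metric: share of spectral power in the lowest fraction `q` of frequency bins.
function lowfreq_power_fraction(x::AbstractVector{<:Real}; q = 0.1)
    ps = power_spectrum(x .- sum(x) / length(x))   # subtract the mean first
    ncut = max(1, round(Int, q * length(ps)))
    return sum(ps[1:ncut]) / sum(ps)
end

t = (0:199) ./ 100                 # 2 s sampled at 100 Hz
slow = sin.(2π * 0.5 .* t)         # 0.5 Hz signal
fast = sin.(2π * 20 .* t)          # 20 Hz signal

frac_slow = lowfreq_power_fraction(slow)   # expected to be close to 1
frac_fast = lowfreq_power_fraction(fast)   # expected to be close to 0
```

With q = 0.1 the cutoff sits below 5 Hz here, so the slow signal should land almost entirely inside the low-frequency band and the fast one almost entirely outside it.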

The PermutationEntropy is now also implemented. It was added to the test suite in accordance with what is documented here. This is just a first test and should be replaced by something more robust in the long term (George, your expertise might be required on that one, as complexity measures seem to be a slightly familiar topic to you).
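For readers unfamiliar with the quantity, a minimal sketch of normalised permutation entropy (order m, unit delay) could look as follows. This is a textbook-style illustration with a made-up function name, not the code added in this PR, and the sanity checks at the end are only assumptions about what a first test might verify.

```julia
using Random

# Normalised permutation entropy of order `m` with unit delay (illustrative sketch).
function permutation_entropy(x::AbstractVector{<:Real}; m::Int = 3)
    counts = Dict{Vector{Int}, Int}()
    nwin = length(x) - m + 1
    for i in 1:nwin
        pattern = sortperm(view(x, i:(i + m - 1)))   # ordinal pattern of this window
        counts[pattern] = get(counts, pattern, 0) + 1
    end
    probs = [c / nwin for c in values(counts)]
    h = -sum(p * log(p) for p in probs)              # Shannon entropy of the patterns
    return h / log(factorial(m))                     # normalise to the range [0, 1]
end

ramp = collect(1.0:100.0)                    # monotonic: a single ordinal pattern
noise = rand(MersenneTwister(1), 5000)       # white noise: all patterns roughly equally likely

pe_ramp = permutation_entropy(ramp)          # expected to be (numerically) zero
pe_noise = permutation_entropy(noise)        # expected to be close to one
```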

Significance testing

The significance testing previously relied on too many functions that were hard to maintain properly. In the long term we want to rely on TimeseriesSurrogates.SurrogateTest, but for now it is not compatible with our purpose. Therefore, incremental p-value computation has been implemented within indicators_analysis(), in a similar way but better suited to our needs.
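To make "incremental" concrete, here is a sketch under stated assumptions: the p-value is accumulated one surrogate at a time, so no array of surrogate results has to be stored. The function name incremental_pvalue is hypothetical, and a plain random shuffle stands in for whatever surrogate method is actually used.

```julia
using Random

# Hypothetical helper: right-tailed p-value accumulated one surrogate at a time.
function incremental_pvalue(metric, x; n_surrogates = 1000, rng = MersenneTwister(1))
    observed = metric(x)
    pval = 0.0
    for _ in 1:n_surrogates
        s = shuffle(rng, x)              # stand-in surrogate: random permutation of x
        pval += metric(s) >= observed    # a Bool adds as 0 or 1
    end
    return pval / n_surrogates
end

# Simple trend statistic: largest when the values are sorted in increasing order.
trend(x) = sum((1:length(x)) .* x)

p_ramp = incremental_pvalue(trend, collect(1.0:50.0))   # strong trend, small p-value
p_flat = incremental_pvalue(trend, fill(1.0, 50))       # no structure, p-value of 1
```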

Default choices

Some default choices rely on functions. These are now gathered in misc/params.jl to give developers a better overview. Some of these functions now emit warnings or info messages that might be relevant for the user.

Bracketing

So far, the user could only influence the bracketing by window-mapping last, midpoint or first over the time vector. This is now handled within slidebracket, which emits an info message depending on the choice. The reason is that different people might have different goals with this package: online vs. offline analysis, i.e. predicting vs. recognizing a transition. Users now know whether they have chosen the right option for their purpose.
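The effect of the three bracketing choices can be sketched as follows. windowmap_sketch and this midpoint definition are assumptions made for the example and need not match the package's windowmap or slidebracket.

```julia
# Middle element of a window (assumed definition, for illustration only).
midpoint(v) = v[(length(v) + 1) ÷ 2]

# Apply `f` to each window of `width` samples, advancing by `stride` samples.
function windowmap_sketch(f, t::AbstractVector; width::Int, stride::Int = 1)
    starts = 1:stride:(length(t) - width + 1)
    return [f(view(t, i:(i + width - 1))) for i in starts]
end

t = collect(0.0:1.0:9.0)

t_last  = windowmap_sketch(last, t; width = 5, stride = 2)      # [4.0, 6.0, 8.0]
t_mid   = windowmap_sketch(midpoint, t; width = 5, stride = 2)  # [2.0, 4.0, 6.0]
t_first = windowmap_sketch(first, t; width = 5, stride = 2)     # [0.0, 2.0, 4.0]
```

Stamping each window with last only uses information available up to that time, which suits online analysis (prediction). Stamping with midpoint centres the window on its timestamp, which is more natural for offline analysis of an already recorded transition.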

Minors

In our window-viewing routines, the views so far had width + 1 elements. This is now corrected to width. The docs were updated and minor corrections made.
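The kind of off-by-one involved can be illustrated as follows. The exact indexing used in the package is an assumption here; the point is only that an inclusive range i:(i + width) holds width + 1 samples.

```julia
x = collect(1:10)
width = 4
i = 3

too_wide = view(x, i:(i + width))        # inclusive range: width + 1 = 5 samples
correct  = view(x, i:(i + width - 1))    # exactly width = 4 samples
```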

Datseris commented 1 year ago

@JanJereczek regarding your question about incorporating the time window choice into the IndicatorConfig: at the moment, we don't do this, right? The user creates the time vector independently. If you add the time window choice (last, mid, etc.) to the indicator config, how does the user obtain the time window?

JanJereczek commented 1 year ago

Hi @Datseris, and sorry for the delayed answer. The last two weeks were simply too busy, but I am back on track.

In a previous discussion you mentioned that the initialization of precomputable functions should be eased and that we should get rid of something like:

t = ...
windowparams = NamedTuples(...)
lfps = LowfreqPowerSpectrum(t[1:windowparams.width], q_lofreq = 0.1)
indicators = [lfps, var]
t_indicators = windowmap(last, t, windowparams)
indconfig = IndicatorsConfig(t, indicators, width = 300, stride = 10)

My way out of this so far was to wrap many of these things in IndicatorsConfig:

indicators = [LowfreqPowerSpectrum(q = 0.1), var]
indconfig = IndicatorsConfig(t, indicators, last, width = 300, stride = 10)

The associated time vector is computed internally based on the function provided for it (here last) and can then be obtained via indconfig.t_indicators.
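A minimal sketch of how such a wrapper could derive the time vector internally is given below. IndicatorsConfigSketch is a hypothetical stand-in, not the actual IndicatorsConfig; its field names other than t_indicators and the window indexing are assumptions.

```julia
using Statistics

# Hypothetical stand-in for the configuration struct discussed above.
struct IndicatorsConfigSketch{T<:AbstractVector, I, TI<:AbstractVector}
    t::T
    indicators::I
    t_indicators::TI
    width::Int
    stride::Int
end

# Outer constructor: derive the indicator time vector from `f` (e.g. `last`).
function IndicatorsConfigSketch(t, indicators, f = last; width::Int, stride::Int = 1)
    starts = 1:stride:(length(t) - width + 1)
    t_indicators = [f(view(t, i:(i + width - 1))) for i in starts]
    return IndicatorsConfigSketch(t, indicators, t_indicators, width, stride)
end

t = collect(0.0:1.0:999.0)
cfg = IndicatorsConfigSketch(t, [var, mean], last; width = 300, stride = 10)

cfg.t_indicators[1]      # 299.0, the last time point of the first window
```

The user never calls a window-mapping function directly; the cost is that the time vector can only be customised through the single function argument.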

The first option is more flexible but more verbose. The second is a two-liner that might, however, be somewhat more limited. I don't have a strong opinion on this. Please tell me which you prefer.

Datseris commented 1 year ago

I prefer the second option.

JanJereczek commented 1 year ago

Hi @Datseris, as you can see, the two issues still unresolved are linked to the use of ps in the PrecomputedLowfreqPowerSpectrum. I did not understand how it would be helpful, but I am glad to implement it if you can explain it to me a bit more.

The rest is resolved and should be ready for review. By following the last section of the example in the docs, you should see what the interface now looks like.

PS: I noticed I had an error in the p-value computation when using :both (see indicators_analysis). Compared to the code of TimeseriesSurrogates.jl, it requires a bit more adaptation than :right and :left.
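To show why :both is less straightforward than the one-sided cases, here is a sketch of the three tail options. It uses one common convention for the two-sided case (twice the smaller one-sided p-value, capped at 1), which is an assumption and not necessarily what TimeseriesSurrogates.jl or indicators_analysis does. The name tail_pvalue is hypothetical.

```julia
# Hypothetical helper: p-value from surrogate metric values for the three tail options.
function tail_pvalue(observed, surrogate_values; tail::Symbol = :right)
    n = length(surrogate_values)
    right = count(>=(observed), surrogate_values) / n   # surrogates at least as large
    left = count(<=(observed), surrogate_values) / n    # surrogates at least as small
    if tail == :right
        return right
    elseif tail == :left
        return left
    elseif tail == :both
        return min(1.0, 2 * min(left, right))           # assumed two-sided convention
    else
        throw(ArgumentError("tail must be :left, :right or :both"))
    end
end

vals = collect(1.0:100.0)                        # pretend surrogate metric values

p_right = tail_pvalue(98.0, vals; tail = :right) # 3 of 100 values are >= 98
p_left  = tail_pvalue(98.0, vals; tail = :left)  # 98 of 100 values are <= 98
p_both  = tail_pvalue(98.0, vals; tail = :both)  # twice the smaller tail
```

With this convention the two-sided case needs both counters to be tracked, whereas :right and :left each need only one.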