Changed API for metrics requiring initialization and for significance testing

JanJereczek commented 1 year ago

Major changes:

Some indicators (e.g. LowfreqPowerSpectrum) and change metrics (e.g. RidgeRegressionSlope) need an initialization step. The convenience functions now handle this automatically by calling init_metrics().
indicators_analysis() now allocates less memory because the p-value is computed incrementally. In the long term, we should use a threaded function for significance testing from TimeseriesSurrogates.jl. This change implied deleting the previous significance-testing code.

Minor changes:

LowfreqPowerSpectrum (new indicator)
PermutationEntropy (new indicator)
Most default params are now defined in params.jl
Corrected windowing (was off by one index into the future)
Warning for potentially bad default choices of windowing params
slidebracket(), mostly to handle the time vectors in a more user-friendly way

JanJereczek commented 1 year ago

Easy initialization of pre-computable metrics

As pointed out by George, it is more intuitive to provide the parameters of a metric (either indicator or change metric) together with the metric itself (in a previous version of this PR, this was handled by a separate struct Params). The user can now simply run, for instance:

metrics = RidgeRegressionSlope(lambda=0.0)

This is now possible by introducing:

An abstract type PrecomputableFunction that makes it possible to recognize the functions that can be pre-computed.
Two different structs for the same function, e.g. RidgeRegressionSlope <: PrecomputableFunction and PrecomputedRidgeRegressionSlope <: Function. The former stores the user-defined parameters, while the latter can be applied directly to a vector to return the metric.
Initializing the latter from the former is handled within the precompute_metrics function. This is possible because all the PrecomputableFunctions share the same syntax for precomputation: precompute(metric, t), with t the time vector needed for pre-computation.
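To make the pattern concrete, here is a minimal sketch of how the two structs and precompute(metric, t) could fit together. The type names follow the description above, but the field layout, the ridge formula and the choice to make PrecomputableFunction a subtype of Function are assumptions for illustration, not the code of this PR.

```julia
using LinearAlgebra

# Assumption: the abstract type is itself a subtype of Function.
abstract type PrecomputableFunction <: Function end

# Stores only the user-defined parameter.
struct RidgeRegressionSlope <: PrecomputableFunction
    lambda::Float64
end
RidgeRegressionSlope(; lambda = 0.0) = RidgeRegressionSlope(lambda)

# Stores the row of the regression matrix that maps a window onto its slope.
struct PrecomputedRidgeRegressionSlope <: Function
    slope_row::Vector{Float64}
end

# Shared syntax: precompute(metric, t), with t the time vector of one window.
function precompute(m::RidgeRegressionSlope, t::AbstractVector{<:Real})
    T = hcat(collect(float.(t)), ones(length(t)))   # design matrix [t 1]
    M = (T' * T + m.lambda * I) \ T'                # ridge solution operator, 2 x n
    return PrecomputedRidgeRegressionSlope(M[1, :])
end

# Applying the precomputed struct to a window returns the slope.
(p::PrecomputedRidgeRegressionSlope)(x::AbstractVector{<:Real}) = dot(p.slope_row, x)

# Plain functions need no precomputation and pass through unchanged.
precompute(f::Function, t) = f
```

With this layout, precompute(RidgeRegressionSlope(lambda = 0.0), t) returns a callable that can be applied to every window of the same length without rebuilding the regression matrix.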

Additional metrics

The LowfreqPowerSpectrum is now implemented. It relies on the aforementioned pre-computation functionality. It was added to the test suite by checking whether a low- and a high-frequency signal would return meaningful results.
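The kind of check described above can be sketched as follows. This is not the package implementation: it uses a naive O(n^2) discrete Fourier transform to stay dependency-free, and lowfreq_power_fraction is a hypothetical name. Only the keyword q_lofreq is borrowed from the discussion.

```julia
# Power spectrum via a naive discrete Fourier transform.
# Only positive frequencies are kept and the DC component is excluded.
function power_spectrum(x::AbstractVector{<:Real})
    n = length(x)
    nfreq = n ÷ 2
    ps = zeros(nfreq)
    for k in 1:nfreq
        s = sum(x[j] * cis(-2π * k * (j - 1) / n) for j in 1:n)
        ps[k] = abs2(s)
    end
    return ps
end

# Hypothetical indicator: share of the spectral power contained in the
# lowest q_lofreq fraction of the frequency bins.
function lowfreq_power_fraction(x::AbstractVector{<:Real}; q_lofreq = 0.1)
    ps = power_spectrum(x)
    nlow = max(1, round(Int, q_lofreq * length(ps)))
    return sum(ps[1:nlow]) / sum(ps)
end
```

A slow sinusoid should give a value close to 1 and a fast one a value close to 0, which is the "meaningful result" such a test can assert.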

The PermutationEntropy is now also implemented. It was added to the test suite in accordance with what is documented here. This is just a first test and should be replaced by something safer in the long term (George, your expertise might be required on that one as complexity measures seem to be a slightly familiar topic to you).
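For reference, a minimal sketch of a normalized permutation entropy is given below. It is a textbook-style version based on ordinal patterns of m consecutive samples, not the implementation added in this PR, and the function name and the keyword m are assumptions.

```julia
# Normalized permutation entropy of order m:
# Shannon entropy of the ordinal-pattern distribution, divided by log(m!).
function permutation_entropy(x::AbstractVector{<:Real}; m::Int = 3)
    counts = Dict{Vector{Int}, Int}()
    npatterns = length(x) - m + 1
    for i in 1:npatterns
        pattern = sortperm(view(x, i:i+m-1))   # ordinal pattern of this window
        counts[pattern] = get(counts, pattern, 0) + 1
    end
    h = 0.0
    for c in values(counts)
        p = c / npatterns
        h -= p * log(p)
    end
    return h / log(factorial(m))   # 0: a single pattern, 1: all patterns equally likely
end
```

A safer test could compare a monotonic ramp, which yields exactly 0, against white noise, which should yield a value close to 1.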

Significance testing

The significance testing previously relied on too many functions that were hard to maintain properly. In the long term we want to rely on TimeseriesSurrogates.SurrogateTest, but for now this function is not compatible with our purpose. Therefore, incremental p-value computation has been implemented within indicators_analysis() in a similar way that is better suited to our needs.
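The idea behind the incremental computation can be sketched as below: rather than storing the metric of every surrogate, only a counter is kept and updated after each surrogate. The function names are hypothetical, and plain random shuffles stand in for the surrogates, which is an assumption (the package relies on TimeseriesSurrogates.jl for surrogate generation).

```julia
using Random, Statistics

# Lag-1 autocorrelation, used here as an example metric.
lag1_autocor(x) = cor(view(x, 1:length(x)-1), view(x, 2:length(x)))

# Right-tailed p-value estimated incrementally: count the surrogates whose
# metric is at least as large as the metric of the original series.
function incremental_pvalue(x, metric; n_surrogates = 1000, rng = Random.default_rng())
    reference = metric(x)
    surrogate = collect(x)     # single buffer, reused for every surrogate
    exceed = 0
    for _ in 1:n_surrogates
        shuffle!(rng, surrogate)
        exceed += metric(surrogate) >= reference
    end
    return exceed / n_surrogates
end
```

Memory use is independent of n_surrogates, since only one buffer and one integer are kept alive during the loop.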

Default choices

Some default choices rely on functions. They are now gathered in misc/params.jl to give developers a better overview. Some of these functions now emit warnings or info messages that might be relevant for the user.

Bracketing

So far, the user could only influence the bracketing by window-mapping last, midpoint or first over the time vector. This is now handled within slidebracket, which emits an info message depending on the choice. The reason is that different people might have different goals with this package: online vs. offline analysis, i.e. predicting vs. recognizing a transition. Users now know whether they have chosen the right option for their purpose.
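As an illustration of what the bracketing choice means, the sketch below assigns a time stamp to each sliding window with first, midpoint or last. bracket_times and midpoint are hypothetical helpers and the info messages are made up for the example. They are not the actual slidebracket code.

```julia
# Hypothetical helper: middle sample of a window (lower middle for even widths).
midpoint(w) = w[(length(w) + 1) ÷ 2]

# Time stamp of each sliding window, chosen by the bracketing function.
function bracket_times(bracket, t::AbstractVector; width::Int, stride::Int = 1)
    if bracket === last
        @info "Each value only uses past samples: suited for online analysis (prediction)."
    else
        @info "Each value also uses samples after its time stamp: suited for offline analysis (recognition)."
    end
    starts = 1:stride:(length(t) - width + 1)
    return [bracket(view(t, s:s+width-1)) for s in starts]
end
```

With t = 1:10, width = 4 and stride = 2, last gives the stamps 4, 6, 8, 10, whereas first gives 1, 3, 5, 7: the indicator values are identical, only the time they are attributed to differs.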

Minors

In our window-viewing routines, we had views of width + 1 so far. Now corrected to width. Docs updated and minor corrections made.

Datseris commented 1 year ago

@JanJereczek regarding your question about incorporating the time window choice into the IndicatorConfig: at the moment, we don't do this, right? The user creates the time vector independently. If you add the time window choice (last, mid, etc.) to the indicator config, how does the user obtain the time window?

JanJereczek commented 1 year ago

> @JanJereczek regarding your question about incorporating the time window choice into the IndicatorConfig: at the moment, we don't do this, right? The user creates the time vector independently. If you add the time window choice (last, mid, etc.) to the indicator config, how does the user obtain the time window?

Hi @Datseris and sorry for the delayed answer. The last two weeks were simply too tight but I am back on track.

In a previous discussion you mentioned that the initialization of precomputable functions should be made easier and that we should get rid of something like:

t = ...
windowparams = NamedTuples(...)
lfps = LowfreqPowerSpectrum(t[1:windowparams.width], q_lofreq = 0.1)
indicators = [lfps, var]
t_indicators = windowmap(last, t, windowparams)
indconfig = IndicatorsConfig(t, indicators, width = 300, stride = 10)

My way out of this so far was to wrap many of these things in IndicatorsConfig:

indicators = [LowfreqPowerSpectrum(q = 0.1), var]
indconfig = IndicatorsConfig(t, indicators, last, width = 300, stride = 10)

The associated time vector is computed internally based on the function provided for it (here last) and can then be obtained by calling indconfig.t_indicators.

The first option is more flexible but more verbose. The second is a two-liner that might, however, come with a few more limitations. I don't have a strong opinion on this. Please tell me which one you prefer.
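To show what the second option implies internally, here is a rough sketch of a config object that computes and stores the window time stamps itself. SketchConfig is a hypothetical stand-in, not the actual IndicatorsConfig, and the precomputation of the indicators is left out.

```julia
# Hypothetical stand-in for the config object discussed above.
struct SketchConfig{T, F}
    t_indicators::Vector{T}   # time stamps of the windows, computed internally
    indicators::Vector{F}
    width::Int
    stride::Int
end

# The bracketing function (last by default) is applied to each window of t.
function SketchConfig(t::AbstractVector, indicators::Vector, bracket = last; width, stride = 1)
    starts = 1:stride:(length(t) - width + 1)
    t_indicators = [bracket(view(t, s:s+width-1)) for s in starts]
    return SketchConfig(t_indicators, indicators, width, stride)
end
```

The user never builds the time vector of the indicators by hand and reads it back as a field of the config instead.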

Datseris commented 1 year ago

I prefer the second option.

JanJereczek commented 1 year ago

Hi @Datseris, as you can see, the two issues that are still unresolved are linked to the use of ps in the PrecomputedLowfreqPowerSpectrum. I did not understand how it would help, but I am glad to implement it if you can explain it to me a bit more.

The rest is resolved and should be ready for review. By following the last section of the example in the docs, you should see what the interface now looks like.

PS: I noticed I had an error in the p-value computation when using :both (see indicators_analysis). It requires a bit more adaptation than :right and :left compared to the code of TimeseriesSurrogates.jl.
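The extra care needed for :both can be illustrated with the sketch below: a two-sided p-value requires one counter per tail, whereas :right and :left need a single one. The convention used here (twice the smaller one-sided p-value, capped at 1) is a common choice but an assumption on my side, and both function names are hypothetical.

```julia
# Count the surrogate values in each tail relative to the reference value.
function count_tails(reference::Real, surrogate_values)
    n_right = count(v -> v >= reference, surrogate_values)
    n_left = count(v -> v <= reference, surrogate_values)
    return n_right, n_left
end

# Turn the tail counts into a p-value for the requested tail.
function pvalue_from_counts(n_right::Int, n_left::Int, n_surrogates::Int, tail::Symbol)
    if tail === :right
        return n_right / n_surrogates
    elseif tail === :left
        return n_left / n_surrogates
    elseif tail === :both
        # Assumed convention: twice the smaller one-sided p-value, capped at 1.
        return min(1.0, 2 * min(n_right, n_left) / n_surrogates)
    else
        throw(ArgumentError("tail must be :left, :right or :both"))
    end
end
```

In an incremental setting both counters can be updated after each surrogate, so the memory footprint stays the same as for the one-sided cases.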

JuliaDynamics / TransitionsInTimeseries.jl