JuliaDynamics / TransitionsInTimeseries.jl

Transition Indicators / Early Warning Signals / Regime Shifts / Change Point Detection
MIT License
18 stars 5 forks source link

Detection of significant evolution of an indicator #16

Closed JanJereczek closed 1 year ago

JanJereczek commented 1 year ago

evolution_metrics.jl:

significance.jl:

JanJereczek commented 1 year ago

Thanks a lot for your review George! I am already noticing some benefits of our work for my own coding habits :) Most of what you say is crystal clear apart from two things:

Bullet point 2: Do you mean wrapping the following lines into a function?

wv_surr_indicator = WindowViewer(s, wv_indicator_width, wv_indicator_stride)
surr_x_indicator = map(indicator, wv_surr_indicator)
wv_surr_evolution = WindowViewer(surr_x_indicator, wv_evolution_width, wv_evolution_stride)
surr_evolution_ts = map(evolution_metric, wv_surr_evolution)

Bullet point 3: Do you mean generating the surrogates before the loop and basically saving them? I had thought of providing a vector of indicators (instead of just one as it is the case right now) and make the inner-most loop go over them. Maybe even provide a vector of evolution metrics if we don't want to use the same for all indicators. This way I wouldn't need to store the surrogates. Something like:

# Just an example of how these vectors would look like
indicators = [var, ar1]
evolution_metrics = [ridge_slope, corr_kendall]

# here some code is missing but we get the point
for i in 1:n_surro
    for k in 1:n_indicators
         s = sgen()
         surr_evolutions[i,:,k] = compute_indicator_evolution(s, indicators[k], evolution_metrics[k])
    end
end

Naming: I had the feeling that indicator_evolution() was a sensible name as we compute the evolution (either trend or change point - we need a generic name to describe both options) of an indicator (the call of map that you mention that gives the time-series but not the evolution metric yet!). I am open for suggestions though!

All the comments regarding code make perfect sense and I will embed them asap :)

Datseris commented 1 year ago

What's the case of using corr_kendall? The first case with the ridge is clear: we are computing the slope of e.g., AR1 coefficient. And we want to see a significant increase of slope. What is the correlation coefficient supposed to yield? And the correlation between what? The meta-analysis descriptor (what you call evolution metrics) are functions of 1 argument: a timeseries. What is the Kendall coefficient computing here?

Datseris commented 1 year ago

Bullet point 3: Do you mean generating the surrogates before the loop and basically saving them? I had thought of providing a vector of indicators (instead of just one as it is the case right now) and make the inner-most loop go over them. Maybe even provide a vector of evolution metrics if we don't want to use the same for all indicators. This way I wouldn't need to store the surrogates. Something like:

Yes, exactly like that.

Datseris commented 1 year ago

Bullet point 2: Do you mean wrapping the following lines into a function?

Well yes. The function doesn't care if the input is a surrogate or not; it does the same thing for both the real signal and the surrogate anyways. So you define a function once and call it twice: for the signal and the surrogate.

JanJereczek commented 1 year ago

In many publications, the kendall-tau correlation is used to measure the trend. It yields 1 for monotonously increasing time-series and -1 for monotonously decreasing one (and any value in-between if there is no monotonous behaviour). It is part of StatsBase.jl (https://juliastats.org/StatsBase.jl/stable/ranking/#StatsBase.corkendall) and for now I just used it as a placeholder. I wanted to illustrate that we might want to use the evolution metric n°1 to measure the trend of indicator A, and evolution metric n°2 to measure the trend of indicator B.

JanJereczek commented 1 year ago

Note that corrkendall(x, y=x) is by default a 1-argument function following that formula:

\tau=\frac{2}{n(n-1)} \sum \operatorname{sgn}\left(x_i-x_j\right) \operatorname{sgn}\left(y_i-y_j\right)
Datseris commented 1 year ago

okj sounds good. The "metaanalysis metric" (which is what you call "evolution metric") can be either a function or a vector of functions of equal length to the vector of indicators.

JanJereczek commented 1 year ago

Mostly done with the modifications. I feel it would be nice to have a struct defining the parameters of the meta-analysis. Something like:

struct MetaAnalysisParameters
    n_surrogates::Int
    surrogate_method::Surrogate
    rng::AbstractRNG
    wv_indicator_width::Int
    wv_indicator_stride::Int
    wv_evolution_width::Int
    wv_evolution_stride::Int
end

That way, we can pass these parameters easier from a function to an other. What do you think?

Datseris commented 1 year ago

Sounds good!

JanJereczek commented 1 year ago

Not completely done yet but not missing much! I will write here when it is ready for review :)

Datseris commented 1 year ago

ok!

JanJereczek commented 1 year ago

Ready for review! I think we should speak about naming in our next meeting. I already feel we are close to having a versatile interface :)

Datseris commented 1 year ago

I'm pushing some commits here to organize things a bit. Two general comments:

Datseris commented 1 year ago

Actually, Jan, forget what I said. The code is fine the way it is now. Here is what I propose: make sure tests pass. Then, I merge and we start working on the documentation. The documentation will allow us to see what things are overengineered and which are not.

JanJereczek commented 1 year ago

Hi George,

Thanks for your corrections! Some remarks:

After merge into main I see following things to do next:

What do you think?

Datseris commented 1 year ago

Highest priority now is the documentation, so that you can explain how you envision the package to be used. Then, we follow with your two suggestions.