Generalize transfer operator

rusandris commented 2 months ago

Right now the transfer operator works only on an AbstractBinning outcome space. It would be nice to have a more general notion of the transfer operator, so that one could call transferoperator regardless of which outcome space is used.

This way transferoperator would become something that operates on a series of outcomes (symbolic trajectories) instead of being itself an outcome space. At least this is the way we did it in StateTransitionNetworks.jl.

Let me know if this makes sense.

kahaaga commented 2 months ago

Hey, @rusandris! Thanks for bringing this up. Your suggestion makes perfect sense. There is no reason that the transfer operator should operate on binned outcome spaces only. The reason I never implemented a generic version of transferoperator is that the implementation that's currently here was developed as part of one of our research papers, where we explicitly developed new (triangulation-based) binning approaches.

If designed cleverly, this should work on any outcome space. My initial thought is that the TransferOperator struct should be a probabilities estimator, not an outcome space:

probabilities(pest::TransferOperator, o::OutcomeSpace, x) estimates probabilities over the outcomes of x, constructed according to o, using an approximation of the transfer operator, from which we can easily derive the stationary probabilities. For enough samples, probabilities(est::TransferOperator, o::OutcomeSpace, x) should then be roughly equivalent to probabilities(est::TransferOperator, o::OutcomeSpace, x).

transferoperator(o::OutcomeSpace, x) computes the approximation to the transfer matrix (so that it can be used elsewhere), which is called by probabilities(est::TransferOperator, o::OutcomeSpace, x), with the additional step of estimating the probabilities.

This way, it will automagically work for any newly implemented outcome space with no additional effort from the user.
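As a rough illustration of the "derive the stationary probabilities" step, here is a minimal sketch using plain power iteration on a row-stochastic matrix. The name stationary_distribution and its keywords are made up for this example and are not the package API; the sketch also assumes the chain is such that the iteration converges (e.g. irreducible and aperiodic).

```julia
using LinearAlgebra: norm

# Hypothetical helper (not part of ComplexityMeasures.jl): approximate the
# stationary distribution of a row-stochastic matrix P by power iteration.
function stationary_distribution(P::AbstractMatrix{<:Real};
        maxiter::Int = 10_000, tol::Real = 1e-12)
    n = size(P, 1)
    size(P, 2) == n || throw(ArgumentError("P must be square"))
    p = fill(1 / n, n)              # start from the uniform distribution
    for _ in 1:maxiter
        pnew = P' * p               # one step of p ← p P, written with column vectors
        pnew ./= sum(pnew)          # renormalize to guard against round-off
        norm(pnew - p, 1) < tol && return pnew
        p = pnew
    end
    return p                        # best estimate if tol was not reached
end

# Toy two-state transition matrix (rows sum to 1).
P = [0.9 0.1;
     0.5 0.5]
p = stationary_distribution(P)
```

For this toy matrix the balance equation 0.1 p[1] = 0.5 p[2] gives the stationary distribution (5/6, 1/6), which the iteration should approach.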

Does that make sense?

kahaaga commented 2 months ago

Currently, the estimation keywords to TransferOperator for the binning approach are stored in the TransferOperator struct. I guess we can just make a dedicated TransferOperatorBinningKeywords struct (with a much better name), or other keyword structs for other methods if necessary, that are given to TransferOperator. That way, we simply dispatch on the estimation parameters if basic estimation isn't sufficient. Not sure if this makes sense... Will have to give it some thought!

Datseris commented 2 months ago

> This way transferoperator would become something that operates on a series of outcomes (symbolic trajectories) instead of being itself an outcome space

Yes. We first use codify and then pass the "symbols timeseries" into the "actual" transfer operator estimation.

> For enough samples, probabilities(est::TransferOperator, o::OutcomeSpace, x) should then be roughly equivalent to probabilities(est::TransferOperator, o::OutcomeSpace, x).

Is there a typo here? Both things are the same.

@kahaaga I don't think we need to complicate anything here. TransferOperator receives as input any OutcomeSpace. Then it calls codify, and then it uses the symbolized timeseries to estimate the transfer matrix. No need for any additional constructs like TransferOperatorBinningKeywords. For all outcome spaces, their arguments are given to the outcome space construction itself.
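A minimal sketch of the second half of that pipeline, assuming the codify step has already produced integer codes in 1:n_outcomes. The function name transition_matrix is hypothetical and only illustrates a plain count-based estimate, not the existing implementation.

```julia
# Hypothetical helper: `transition_matrix` is an illustrative name, not the
# package API. `codes` is an integer-encoded time series with values in
# 1:n_outcomes (what a codify-like step is assumed to return).
function transition_matrix(codes::AbstractVector{<:Integer},
        n_outcomes::Integer = maximum(codes))
    counts = zeros(Int, n_outcomes, n_outcomes)
    for t in 1:length(codes)-1
        counts[codes[t], codes[t+1]] += 1   # count the transition i → j
    end
    P = zeros(Float64, n_outcomes, n_outcomes)
    for i in 1:n_outcomes
        rowsum = sum(@view counts[i, :])
        # Outcomes that are never left (unvisited, or visited only at the
        # last time step) keep an all-zero row; how to treat them is a
        # design choice that this sketch leaves open.
        rowsum > 0 && (P[i, :] .= counts[i, :] ./ rowsum)
    end
    return P
end

codes = [1, 2, 1, 3, 1, 2, 2, 1]
P = transition_matrix(codes)
```

Nothing in this function depends on how the codes were produced, which is the point of separating the encoding from the matrix estimation.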

kahaaga commented 2 months ago

> For enough samples, probabilities(est::TransferOperator, o::OutcomeSpace, x) should then be roughly equivalent to probabilities(est::TransferOperator, o::OutcomeSpace, x)

Yes, it should be: probabilities(est::TransferOperator{ValueBinning}, o::OutcomeSpace, x) is roughly equivalent to probabilities(est::RelativeAmount, o::ValueBinning, x), with equality in the limit of infinitely many samples.
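To make the link between the two estimates concrete, here is a small sketch for a plain count-based transition matrix (not the package's TransferOperator). The toy series is chosen so that its first and last symbols coincide; in that case the relative visitation frequencies are exactly stationary under the estimated matrix. For a general series of length N, the mismatch per outcome is at most 1/(N-1), which is consistent with equality in the limit of many samples.

```julia
# Integer-encoded toy series whose first and last symbols coincide, so every
# visit in codes[1:end-1] is matched by exactly one recorded arrival.
codes = [1, 2, 1, 3, 1, 2, 2, 1]
n = maximum(codes)

# Row-normalized transition counts (a plain count-based estimate).
counts = zeros(Int, n, n)
for t in 1:length(codes)-1
    counts[codes[t], codes[t+1]] += 1
end
P = counts ./ sum(counts; dims = 2)

# Relative visitation frequencies over the points that have a successor.
freqs = [count(==(i), codes[1:end-1]) / (length(codes) - 1) for i in 1:n]

# Applying the estimated operator once leaves the frequencies unchanged.
residual = maximum(abs.(P' * freqs .- freqs))
```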

> No need for any additional constructs like TransferOperatorBinningKeywords

The transfer operator already has configurable keywords that go beyond those specified in ValueBinning (regarding e.g. boundary conditions and number of iterations for inferring the stationary distribution). These keywords have nothing to do with the binning in other contexts. I'd say introducing keywords at the level of ValueBinning that have nothing to do with regular binning is also additional complexity. I'd rather have them at the level of TransferOperator, where they are actually used.

kahaaga commented 2 months ago

> Yes. We first use codify and then pass the "symbols timeseries" into the "actual" transfer operator estimation.

This also links to #420, where I also want the outcomes explicitly. These will be needed to preserve information about visitors that would otherwise be lost if simply passing the symbolic time series alone to the transfer operator estimator.

Datseris commented 2 months ago

Aaah, now I understand. Okay, sure, but I would argue these could be normal keywords for TransferOperator. These keywords are simply ignored for non-binning outcome spaces? Actually, almost all outcome spaces are binnings at the end of the day, so maybe these keywords are generically valid anyway...?

Datseris commented 2 months ago

> Yes. We first use codify and then pass the "symbols timeseries" into the "actual" transfer operator estimation.

> This also links to #420, where I also want the outcomes explicitly. These will be needed to preserve information about visitors that would otherwise be lost if simply passing the symbolic time series alone to the transfer operator estimator

As always with ComplexityMeasures.jl the situation is more complex than I thought :D and it doesn't have a simple clean straightforward solution... Haha!

If you are both participating in the JuliaDynamics meetings then we can discuss this there. You can add it to the agenda @kahaaga @rusandris .

kahaaga commented 2 months ago

> As always with ComplexityMeasures.jl the situation is more complex than I thought :D and it doesn't have a simple clean straightforward solution... Haha!

We can always create a SimpleMeasures.jl package for anything that doesn't belong here 🌝

kahaaga commented 2 months ago

> If you are both participating in the JuliaDynamics meetings then we can discuss this there. You can add it to the agenda @kahaaga @rusandris .

I think I'll make the meeting, so let's do it!

kahaaga commented 2 months ago

Another thing to think about: should there be an equivalent to allprobabilities (i.e. all outcomes are always included during estimation) for transferoperator (perhaps just a keyword argument), so that we can map between a distribution constructed using allprobabilities(::RelativeAmount, ...) and allprobabilities(::TransferOperator, ...)?

rusandris commented 2 months ago

> Yes. We first use codify and then pass the "symbols timeseries" into the "actual" transfer operator estimation.

I guess it's a good idea to have both transferoperator(o::OutcomeSpace, x;kwargs...) and transferoperator(s::Vector{<:Integer};kwargs...) around. The first could call the second method maybe?

> This also links to https://github.com/JuliaDynamics/ComplexityMeasures.jl/issues/420, where I also want the outcomes explicitly.

And transferoperator(o::OutcomeSpace, x;kwargs...) could have a keyword argument return_outcomes (maybe with a better name) to know when to call the codify variant that returns more information about the encoding as mentioned in #420 .

> If you are both participating in the JuliaDynamics meetings then we can discuss this there. You can add it to the agenda @kahaaga @rusandris .

Great idea! We can talk about this in detail at the meetings.

Datseris commented 2 months ago

> And transferoperator(o::OutcomeSpace, x;kwargs...) could have a keyword argument return_outcomes (maybe with a better name) to know when to call the codify variant that returns more information about the encoding as mentioned in #420 .

Unfortunately I don't agree with this approach. I believe it is not a good design for the Julia language to change the return type based on keywords due to the fundamental type instability this creates. I also don't think it is a good software design principle in general, although this is more subjective. I think we would need two different functions, one that returns both and one that is as now.

rusandris commented 2 months ago

> I think we would need two different functions, one that returns both and one that is as now.

Agreed. Two separate methods is the way to go, although we might need to deal with code duplication then (the two methods do almost exactly the same thing) but that's not that big of an issue.

kahaaga commented 2 months ago

> Agreed. Two separate methods is the way to go, although we might need to deal with code duplication then (the two methods do almost exactly the same thing) but that's not that big of an issue.

It's likely possible to get around code duplication by just writing a clever internal method that is called by both exported functions.
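Something along these lines, as a rough sketch only: one internal worker that always computes everything, plus two thin public functions that each have a fixed return type. All names below (_transfer_core, transfer_matrix, transfer_matrix_and_outcomes) are hypothetical and not part of the package.

```julia
# All names here are hypothetical; none of them is the ComplexityMeasures.jl API.

# Internal worker: always computes both the matrix and the visited outcomes.
function _transfer_core(codes::AbstractVector{<:Integer})
    outcomes = sort!(unique(codes))                 # visited outcomes, sorted
    index = Dict(o => i for (i, o) in enumerate(outcomes))
    n = length(outcomes)
    counts = zeros(Int, n, n)
    for t in 1:length(codes)-1
        counts[index[codes[t]], index[codes[t+1]]] += 1
    end
    rowsums = sum(counts; dims = 2)
    P = counts ./ max.(rowsums, 1)                  # avoid 0/0 for rows without transitions
    return P, outcomes
end

# Two public functions, each with a fixed return type (no keyword switch).
transfer_matrix(codes) = first(_transfer_core(codes))
transfer_matrix_and_outcomes(codes) = _transfer_core(codes)

codes = [10, 20, 10, 30, 10]
P = transfer_matrix(codes)
P2, outcomes = transfer_matrix_and_outcomes(codes)
```

The shared logic lives in one place, and neither public function changes what it returns based on a keyword, so the type-stability concern raised above does not arise.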

JuliaDynamics / ComplexityMeasures.jl

Generalize transfer operator #424