Syntax for multi-variable information measures

kahaaga commented 11 months ago

Hey, @Datseris!

I'm finishing up the changes in CausalityTools.jl that will occur due to this PR in ComplexityMeasures.jl. We've gone a bit back and forth on naming and syntax, so I just want to make sure we're on the same page:

CausalityTools.jl extends information from ComplexityMeasures.jl.
information(definition, est, x, y, ...) is used for any multivariate measure that depends on probability mass functions or probability densities. We don't care what type of measure it is or what it intends to quantify - the naming is just a pragmatic choice that mirrors what we did in ComplexityMeasures.jl. information can be used to compute e.g. joint entropies, various divergences, (conditional) mutual information and other other conceivable functionals of probabilities.

Yay/nay?

Datseris commented 11 months ago

yes, but I need https://github.com/JuliaDynamics/ComplexityMeasures.jl/pull/316 to be finished to really comment here.

Datseris commented 11 months ago

It is unclear to me why we have information(definition, est, x, y, ...). It should be information(est, x, y, ...) . The estimator should tell you what it estimates. How can it be otherwise? Didn't we make this decision already in ComplexityMeasures.jl? It clarifies things so much. I am currently (in v.2.10) having lots of trouble keeping track of how things work because of the too many types that go into deciding what a function does.

I would favor drastically to have the estimators reference the definition. That's also the way we decided to do it in ComplxeityMeasures.jl.

kahaaga commented 11 months ago

It is unclear to me why we have information(definition, est, x, y, ...). It should be information(est, x, y, ...) . The estimator should tell you what it estimates. How can it be otherwise? Didn't we make this decision already in ComplexityMeasures.jl? It clarifies things so much. I am currently (in v.2.10) having lots of trouble keeping track of how things work because of the too many types that go into deciding what a function does. I would favor drastically to have the estimators reference the definition. That's also the way we decided to do it in ComplexityMeasures.jl.

For information-theoretic measure definitions, this would work. However, we also have other measures that don't have estimators, for example DistanceCorrelation. We also have cross-mapping, which also currently requires measure + estimator.

To make integration with independence tests and causal inference frameworks seamless, I really want to have a single interface to deal with.

EDIT: AH! I think this is actually not a problem.

estimate always takes a first argument that is either a definition or an estimator (that contains a definition).
For measures that requires estimators, we do estimate(Estimator(definition), x, y, ...)
For measures that don't have estimators, we can just allow estimate(definition, x, y, ...)
Then estimate can be called seamlessly from independence and infer_graph, with one signature (note: estimate and information are here synonymous for info measures)

This will also require a rewrite of the cross-map code, which is some work, but totally doable.

Datseris commented 11 months ago

https://juliadynamics.github.io/ComplexityMeasures.jl/dev/devdocs/#Adding-a-new-InformationMeasureEstimator

that's what we do there, estimators reference the definition...

I am not sure how you include outcome spaces in the above though. I guess they are given as arguments to Estimator?

kahaaga commented 11 months ago

I am not sure how you include outcome spaces in the above though. I guess they are given as arguments to Estimator?

Outcome spaces complicates stuff a bit, because it depends on whether you want to discretize/encode row-wise or column-wise before computing probabilities. I have a sketch for how to solve this. I will try to merge this with an argument-to-estimator approach.

kahaaga commented 11 months ago

I think I have figured out a solution that will work for the new everything-in-the-estimator approach we discussed, @Datseris.

The signatures will be:

information::est::JointProbabilities{<:MultivariateInformationMeasure}, x...), which explicitly constructs the joint pmf and computes any measure based on a pmf.
information(est::DedicatedEstimator{<:MultivariateInformationMeasure}, x...), where DedicatedEstimator is e.g. KSG1 for MIShannon.
information(est::EntropyDecomposition{<:MultivariateInformationMeasure}, x...) for measures that can be decomposed as a combination of discrete/differential entropies. Does no bias correction.
information(est::MIDecomposition{<:MultivariateInformationMeasure}, x...) for measures that can be decomposed as a combination of mutual information terms. Does no bias correction.

The relevant structs are

"""
    EntropyDecomposition(definition::MultivariateInformationMeasure, 
        est::DifferentialInfoEstimator)
    EntropyDecomposition(definition::MultivariateInformationMeasure,
        est::DiscreteInfoEstimator,
        discretization::OutcomeSpace,
        pest::ProbabilitiesEstimator = RelativeAmount())

If `est` is a [`DifferentialInfoEstimator`](@ref), then `discretization` and `pest` 
are ignored. If `est` is a [`DiscreteInfoEstimator`](@ref), then `discretization` and a
probabilities estimator `pest` must also be provided (default to `RelativeAmount`, 
which uses naive plug-in probabilities).

## Usage

- [`information`](@ref)`(est::EntropyDecomposition, x...)`.

See also: [`MutualInformationEstimator`](@ref), [`MultivariateInformationMeasure`](@ref).
"""
struct EntropyDecomposition{D <: MultivariateInformationMeasure, E, D, P}
    definition::D # extend API from complexity measures: definition must be the first field of the info estimator.
    est::E # The estimator + measure which `definition` is decomposed into.
    discretization::D # `Nothing` if `est` is a `DifferentialInfoEstimator`.
    pest::P # `Nothing` if `est` is a `DifferentialInfoEstimator`.
end

"""
    MIDecomposition(definition::MultivariateInformationMeasure, 
        est::MutualInformationEstimator)

Estimate some multivariate information measure specified by `definition`, by decomposing
it into a combination of mutual information terms, which are estimated using `est`.

## Usage

- [`information`](@ref)`(est::MIDecomposition, x...)`.

See also: [`MutualInformationEstimator`](@ref), [`MultivariateInformationMeasure`](@ref).
"""
struct MIDecomposition{D <: MultivariateInformationMeasure, E, D, P}
    definition::D # extend API from complexity measures: definition must be the first field of the info estimator.
    est::E # The MI estimator + measure which `definition` is decomposed into.
end

"""
    JointProbabilities <: InformationMeasureEstimator
    JointProbabilities(
        definition::MultivariateInformationMeasure,
        discretization::Discretization
    )

`JointProbabilities` is a generic estimator for discrete information measures that first
discretizes/encodes the input data according to the `discretization` (typically an `OutcomeSpace`), then
constructs a contingency table of the required dimensionality (a [`Counts`](@ref) instance),
then constructs a multidimensional probability mass function (a [`Probabilities`](@ref)
instance) using plug-in estimation of probabilities (relative frequencies of counts).

Works for any outcome space that implements [`codify`](@ref).

See also: [`Counts`](@ref), [`Probabilities`](@ref), [`ProbabilitiesEstimator`](@ref),
[`OutcomeSpace`](@ref), [`DiscreteInfoEstimator`](@ref).
"""
struct JointProbabilities{M <: MultivariateInformationMeasure, O, P} <: InformationMeasureEstimator{M}
    definition::M # API from complexity measures: definition must be the first field of the infoestimator.
    discretization::O
    pest::P # Not exposed for user for now.

    function JointProbabilities(def::M, disc::D, pest = RelativeAmount()) where {M, D}
        new{M, D, typeof(pest)}(def, disc, pest)
    end
end

Yay/nay/comments?

kahaaga commented 11 months ago

If relevant in the future, and other decomposition-based estimation approach gets its own estimator. It is, for example, possible to use KL-divergence-based estimators for MI. Then, that would become the KLDivergenceDecomposition estimator.

Datseris commented 11 months ago

information::est::JointProbabilities{<:MultivariateInformationMeasure}, x...)

Is this a typo? Should it be information(est::...)?

kahaaga commented 11 months ago

information::est::JointProbabilities{<:MultivariateInformationMeasure}, x...)

Is this a typo? Should it be information(est::...)?

Yes.

Datseris commented 11 months ago

Yes, I agree with the proposed API. I believe some convenience functions need to be put in place however for common usage like we do in ComplexityMeasures.jl.

CausalityTools.jl will surely need a Tutorial page due to the amount of content included.

kahaaga commented 11 months ago

Yes, I agree with the proposed API. I believe some convenience functions need to be put in place however for common usage like we do in ComplexityMeasures.jl.

Yes, some commonly used names should have convenience functions.

CausalityTools.jl will surely need a Tutorial page due to the amount of content included.

Yep. It feels a bit like the package should have been split up into multiple smaller packages at the moment, but that's for future consideration. A tutorial will be nice to have!

Datseris commented 11 months ago

It feels a bit like the package should have been split up into multiple smaller packages at the moment, but that's for future consideration.

I am not sure. I think this is great the way it is, and easier to contribute to for the community. The package has a very specific scope, it isn't doing too many things at once. It offers several different ways to do a main thing.

Besides, I think splitting wouldn't really help users in any way. They anyways would have to learn the same workflow and hence "learn" multiple packages. The tutorial is about learning the workflow, not learning the total amount of content.

kahaaga commented 11 months ago

I am not sure. I think this is great the way it is, and easier to contribute to for the community. The package has a very specific scope, it isn't doing too many things at once. It offers several different ways to do a main thing. Besides, I think splitting wouldn't really help users in any way. They anyways would have to learn the same workflow and hence "learn" multiple packages. The tutorial is about learning the workflow, not learning the total amount of content.

Yep, that's a good point. We'll just have to ensure the tutorial is good enough for a user to understand this workflow, because can be many conceptually distinct and quite complicated steps involved at the different stages of association inference.

JuliaDynamics / Associations.jl

Syntax for multi-variable information measures #352