It should be possible for `Probabilities` to be multidimensional

kahaaga commented 1 year ago

In CausalityTools, I'm working with multidimensional probability mass functions (obtained from multidimensional contingency tables, which are essentially just multidimensional histograms). I can marginalize a joint pmf along one or more dimensions to get 1D, 2D, 3D, or higher-dimensional marginal distributions. When marginalizing out all but one dimension, I'm left with what is essentially a vector that I can wrap in `Probabilities`. But I also need two-dimensional and three-dimensional marginals. It would be nice if these higher-dimensional marginal distributions could also be represented by `Probabilities`.
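To make the marginalization concrete, here is a minimal sketch using only Base Julia. The array `joint` and its 2×3×2 shape are made up for illustration and are not taken from CausalityTools.

```julia
# Hypothetical 2×3×2 contingency table of counts, normalized to a joint pmf.
table = reshape(collect(1:12), 2, 3, 2)
joint = table ./ sum(table)

# Marginalize out dimension 3: the result is a 2D marginal (a 2×3 matrix).
p_xy = dropdims(sum(joint; dims = 3); dims = 3)

# Marginalize out dimensions 2 and 3: the result is a 1D marginal (a length-2 vector).
p_x = dropdims(sum(joint; dims = (2, 3)); dims = (2, 3))
```

Only `p_x` is a vector; `p_xy` is a matrix, so it cannot be wrapped by a vector-only `Probabilities` without flattening.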

I can manage fine without it at the moment, because it's only used internally, but it would be nice to document that we're using the same machinery across packages.

Implementation strategy

I think it should just be a matter of defining

struct Probabilities{T, N} <: AbstractArray{T, N}

instead of having `Probabilities` subtype a vector. The entries of a higher-dimensional marginal still sum to 1, so nothing changes, except that it can be indexed as p[i, j, k, ...] instead of just p[i].
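A minimal sketch of what such an N-dimensional wrapper could look like follows. The name `NDProbabilities`, its field `p`, and the normalizing constructor are hypothetical and chosen for illustration; this is not the actual ComplexityMeasures.jl implementation.

```julia
# Hypothetical sketch of an N-dimensional probabilities wrapper.
struct NDProbabilities{T, N} <: AbstractArray{T, N}
    p::Array{T, N}
    function NDProbabilities(x::AbstractArray{<:Real, N}) where {N}
        # Assumption: the constructor normalizes its input so the entries sum to 1.
        p = Array(x ./ sum(x))
        return new{eltype(p), N}(p)
    end
end

# Minimal AbstractArray interface: size plus linear getindex.
# Cartesian indexing such as P[i, j, k] then falls back to linear indexing.
Base.size(d::NDProbabilities) = size(d.p)
Base.IndexStyle(::Type{<:NDProbabilities}) = IndexLinear()
Base.getindex(d::NDProbabilities, i::Int) = d.p[i]

# Usage: wrap a 2×3×4 array of counts and index it with three indices.
P = NDProbabilities(reshape(1:24, 2, 3, 4))
P[1, 2, 3]
```

Because the wrapper subtypes `AbstractArray{T, N}`, the one-dimensional case behaves like a vector, which is why existing code that indexes with `p[i]` should keep working.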

Note that one could always flatten a multidimensional array to a vector and wrap it with `Probabilities` after the fact, but then keeping track of indices gets much messier when doing triple or quadruple loops with some elaborate indexing on the probabilities.
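As a rough illustration of the bookkeeping involved, the sketch below compares indexing a flattened pmf with indexing the original array. The array `joint` and the indices are invented for the example.

```julia
# Hypothetical 2×3×4 joint pmf and its flattened version.
joint = reshape(collect(1.0:24.0), 2, 3, 4)
joint ./= sum(joint)
flat = vec(joint)

# With the flattened vector, every lookup needs the original shape to
# convert (i, j, k) into a column-major linear index.
i, j, k = 2, 3, 1
lin = LinearIndices(size(joint))[i, j, k]

# Both expressions refer to the same entry.
same_entry = flat[lin] == joint[i, j, k]
```

With a multidimensional wrapper, the shape travels with the probabilities, so the conversion step disappears from the loops.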

Would this be a breaking change?

Datseris commented 1 year ago

Yes to doing this, and no, it isn't a breaking change. We state that `Probabilities` is a wrapper of an Array, most typically a Vector.

kahaaga commented 1 year ago

Fixed in #241

JuliaDynamics / ComplexityMeasures.jl

It should be possible for `Probabilities` to be multidimensional #240
