JuliaDynamics / ComplexityMeasures.jl

Estimators for probabilities, entropies, and other complexity measures derived from data in the context of nonlinear dynamics and complex systems

Differential entropy estimator design must change #238

Closed: Datseris closed this issue 1 year ago

Datseris commented 1 year ago

Our differential entropy design is flawed:

entropy([e::EntropyDefinition,] est::DifferentialEntropyEstimator, x)

it wrongly gives the impression that multiple dispatch can occur between e and est. This is not true, because multiple dispatch is an orthogonal system: the dispatch on e shouldn't depend on est, in the sense that all subtypes of e and all subtypes of est should be combinable. Otherwise one shouldn't write abstract types in the dispatch declaration in the documentation at all.

As has been proposed several times, this should change in favor of the definition being a parameter/field/property of the estimator itself. This makes logical sense as well: the overwhelming majority of estimators estimate a specific definition. Our documentation currently lists 11 estimators, all of which estimate only a single definition, the Shannon entropy. We really shouldn't care if some future estimator estimates more than one definition: the majority of use cases should decide the design, not the outliers.

As such, I will make a pull request that corrects this before this package gets any user base (it doesn't have one yet).

The argument in favor of the current interface, that it supposedly allows easier downstream interface usage, can be resolved by defining a single internal downstream function that errors if the given entropy is not Shannon, and that can be extended for estimators that actually work for more than one entropy definition.
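For concreteness, such a guard could look like this (a minimal sketch; the name require_shannon is hypothetical, the types are ours):

# Hypothetical internal guard. Estimators that only estimate Shannon entropy
# funnel through this; estimators supporting more definitions extend it.
require_shannon(e::Shannon, est::DifferentialEntropyEstimator) = nothing
require_shannon(e::EntropyDefinition, est::DifferentialEntropyEstimator) =
    throw(ArgumentError("$(typeof(est)) does not estimate $(typeof(e))."))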

Datseris commented 1 year ago

I have a suggestion here, since I started working on this already.

Rename entropy(est::DiffEntropyEst, x) to:

diffentropy(est::DiffEntropyEst, x)

and add the (existing) convenience dispatch:

entropy(e::EntropyDef, est::DiffEntropyEst, x) =
    (check_definition_compatibility(e, est); diffentropy(est, x))

So that the differential entropy function is named differently from the discrete entropy function.
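For illustration, the resulting call pattern would be (a sketch; diffentropy does not exist yet, and Kraskov stands in for any of our differential Shannon estimators):

x = rand(10_000)
est = Kraskov(k = 3)              # one of the existing Shannon-only estimators
h1 = diffentropy(est, x)          # proposed differential-entropy function
h2 = entropy(Shannon(), est, x)   # convenience form: check compatibility, forward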

Datseris commented 1 year ago

The main reason I want to remove

entropy([e::EntropyDefinition,] est::DifferentialEntropyEstimator, x)

is because it gives the incorrect/misleading impression of orthogonality between e and est. However, at least scientifically, an estimator estimates something; hence it includes this something in its own definition. An entropy estimator must surely include in its construction how to estimate the entropy; otherwise it wouldn't be an entropy estimator. The possibility of computing more than one "definition" of entropy doesn't violate this rule: it is still a property of the estimator, hence an orthogonal dispatch shouldn't be implied on it.
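To illustrate what "the estimator includes its definition" could mean in code (a sketch with a hypothetical type, not a concrete proposal for any existing estimator):

# Sketch: the definition lives inside the estimator, so one argument carries
# both what is estimated and how.
struct ExampleDiffEst{D <: EntropyDefinition} <: DifferentialEntropyEstimator
    definition::D
    k::Int   # e.g. neighbor count for a kNN-based estimator
end
ExampleDiffEst(; definition = Shannon(), k = 3) = ExampleDiffEst(definition, k)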

kahaaga commented 1 year ago

The dispatch on e shouldn't depend on est in the sense that all subtypes of e and all subtypes of est should be combinable.

I don't see why this has to be the case.

Personally, I really like the approach of defaulting to throwing an "unimplemented" error for a particular combination of measure and estimator. Having entropy(e, est::DiffEntEst) throw an error for cases that don't have an implementation yet is saying "the scientific community has yet to combine these approaches". If entropy(e, est::DiffEntEst) is implemented, then someone has done the effort. This approach is extendable and future-oriented, which is what I would expect from a library of methods.
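Concretely, that default is a single fallback method (a sketch using our type names):

# "Unimplemented by default": any (definition, estimator) pair without a
# dedicated method errors, pointing at exactly what is missing.
entropy(e::EntropyDefinition, est::DifferentialEntropyEstimator, x) =
    throw(ArgumentError("$(typeof(e)) with $(typeof(est)) is not implemented yet."))

# An implemented combination simply adds a more specific method, e.g.
# entropy(e::Shannon, est::Kraskov, x) = ...   # someone has done the effort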

it wrongly gives the impression that multiple dispatch can occur between e and est

But multiple dispatch can occur in our current paradigm, as you also say yourself in your answer.

is because it gives the incorrect/misleading impression of orthogonality between e and est. However, at least scientifically, an estimator estimates something; hence it includes this something in its own definition. An entropy estimator must surely include in its construction how to estimate the entropy; otherwise it wouldn't be an entropy estimator. The possibility of computing more than one "definition" of entropy doesn't violate this rule: it is still a property of the estimator, hence an orthogonal dispatch shouldn't be implied on it.

I think #237 is relevant here. Actually, our definition of Shannon is also an estimator of the true Shannon entropy, because it is an estimate of the true quantity from empirically estimated probabilities. And Shannon entropy (as we define it) with, say, a Miller-Madow correction term (see #237) is another estimator of the true Shannon entropy.

So actually, if we're being precise with terminology, EntropyDefinition should be DiscreteEntropyEstimator, and we should have

entropy(e::DiscreteEntropyEstimator{CorrectionType}, est::ProbabilitiesEstimator)

# or 
entropy(e::DiscreteEntropyEstimatorWithCorrectionA, est::ProbabilitiesEstimator)
entropy(e::DiscreteEntropyEstimatorWithCorrectionB, est::ProbabilitiesEstimator)
...

EDIT: concretely, this would be:

entropy(e::Shannon{<:Union{MaximumLikelihood, MillerMadow, Jackknife, ...}}, est::ProbabilitiesEstimator)
#or
entropy(e::ShannonMaximumLikelihood, est::ProbabilitiesEstimator)
entropy(e::ShannonMillerMadow, est::ProbabilitiesEstimator)

In other words, the probabilities estimator estimates probabilities, which are given to a discrete entropy estimator, which in turn estimates the true entropy from those probabilities in some way. We currently implement only the maximum likelihood estimator of Shannon entropy (i.e. naive relative frequencies), but there are many, many more.
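As a concrete example of the distinction (a sketch of the textbook formulas, not our implementation; see #237): the Miller-Madow estimator takes the same empirical probabilities the plugin estimator uses and adds the bias correction (m - 1)/(2N), where m is the number of nonzero bins and N the sample size.

# Plugin (maximum likelihood) Shannon entropy from relative frequencies.
plugin_entropy(p) = -sum(pᵢ * log(pᵢ) for pᵢ in p if pᵢ > 0)

# Miller-Madow: same probabilities, plus the (m - 1)/(2N) bias correction.
function miller_madow_entropy(counts)
    N = sum(counts)             # total number of observations
    p = counts ./ N             # maximum likelihood probabilities
    m = count(>(0), counts)     # number of occupied bins
    return plugin_entropy(p) + (m - 1) / (2N)
end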

kahaaga commented 1 year ago

it wrongly gives the impression that multiple dispatch can occur between e and est. This is not true, because multiple dispatch is an orthogonal system: the dispatch on e shouldn't depend on est, in the sense that all subtypes of e and all subtypes of est should be combinable. Otherwise one shouldn't write abstract types in the dispatch declaration in the documentation at all.

If we disallow the multiple dispatch approach (with errors for unimplemented combinations) here, and we want to be consistent across the ecosystem, I basically have to rewrite the entire API for CausalityTools again. The exact same principle is used there:

abstract type MutualInformation end
struct MIShannon <: MutualInformation end
struct MITsallisFuruichi <: MutualInformation end
...

mutualinfo(mi::MIShannon, est::ProbabilitiesEstimator)
mutualinfo(mi::MITsallisFuruichi, est::ProbabilitiesEstimator)
mutualinfo(mi::MIShannon, est::DiffEntEst)
mutualinfo(mi::MITsallisFuruichi, est::DiffEntEst)

As I said in a discussion with you elsewhere, the idea that the probabilities estimator should hold the definition of what it estimates doesn't make sense here. The probabilities estimator is basically just an instruction for how to discretize data. It says nothing about Shannon <: EntropyDefinition or MIShannon <: MutualInformation. Similarly, a DiffEntEst doesn't contain any information about the quantity it measures. It is just a method of estimating something, just like a ProbabilitiesEstimator represents a method of discretizing something.

My summary opinion

What we promise is compatibility between the measures and estimators that we publicly document as compatible. We do not promise a codebase adhering to some overarching design principle of multiple-dispatch languages, which may or may not be community standard, and may or may not be the best approach for our particular use case.

Compatibility requirements and a promise to adhere to a design principle can of course be combined in some sensible way, but I think we should consider how to do this more deeply before committing in the heat of the moment. Introducing separate treatment and syntax for estimating discrete versus continuous quantities, based on "we shouldn't base the API design on outliers" and similar arguments, when we've already agreed on, committed to, and publicly released a design, is such a heat-of-the-moment decision. If we make this change for the differential case, then by the same argument we must also make it for the discrete case. But that involves another level of complexity, which is clearly outlined in #237. So this needs more planning before implementing.

We did agree that after 2.0, we would stick to the design agreed upon. I think we should stick to that agreement, and let any further changes go into API 3.0.

The main reason for this is that I really, really need to get upstream stuff done, and if more things change here now, I'm not sure I will be able to finish everything in time for all the deadlines (the workshop, first and foremost, but also application deadlines coming very soon). In particular, if the design philosophy changes on a fundamental level, it'll be weeks if not months before CausalityTools is done.

I think the priority right now is to get a consistent package ecosystem that works. Then, after things work, we can modify the API. I really enjoy the process of co-developing this package, because I'm learning a lot and it forces me to think in ways I usually wouldn't, but right now I think we should look at the curve of incremental improvement rewards versus time spent, and just get something done :D Any philosophical/practical design issues we can easily deal with later in a major release. Major updates don't have to be slow; in fact, we could release ComplexityMeasures 3.0 in a few weeks if we did some breaking change.

kahaaga commented 1 year ago

Meta-summary: I'm going to finalise CausalityTools v2 with respect to ComplexityMeasures v2.X, that is:

One function per measure (conditional_entropy, relative_entropy, mutualinfo, condmutualinfo, transferentropy, etc.) that takes care of both the discrete and continuous cases, with the estimator deciding which quantity is computed; see the sketch below.
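Sketch of the pattern (discrete_mi and differential_mi are hypothetical internal names):

# One public name; the estimator type picks the branch.
mutualinfo(est::ProbabilitiesEstimator, x, y) = discrete_mi(est, x, y)
mutualinfo(est::DifferentialEntropyEstimator, x, y) = differential_mi(est, x, y)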

Then, after this issue has been finally settled, we can do any required changes upstream too. But it has to be thought out more carefully first.

Datseris commented 1 year ago

We did agree that after 2.0, we would stick to the design agreed upon. I think we should stick to that agreement, and let any further changes go into API 3.0.

Alright, you are right. Unfortunately, I do not have the capacity at the moment to read all the text here, and in #237, which seems to be a prerequisite, while keeping all of it in mind simultaneously to actually write a reply. We need a video call at some point where things are processed one by one, step by step, in real time, instead of so much being processed all at once, because I admit I genuinely do not have the processing power to do this... Until then you can leave #239 unmerged and proceed with:

Meta-summary: I'm going to finalise CausalityTools v2 with respect to ComplexityMeasures v2.X, that is:

while I proceed to finish the workshop section on complexity measures and timeseries surrogates.

kahaaga commented 1 year ago

Alright, you are right. Unfortunately, I do not have the capacity at the moment to read all the text here, and in https://github.com/JuliaDynamics/ComplexityMeasures.jl/issues/237, which seems to be a prerequisite, while keeping all of it in mind simultaneously to actually write a reply. We need a video call at some point where things are processed one by one, step by step, in real time, instead of so much being processed all at once, because I admit I genuinely do not have the processing power to do this...

Ok, good! I am starting to get a bit overwhelmed too. Let's just finish up what we have and try to use it a bit before forming any further opinions on the design. We'll keep this issue open and revisit it once things have calmed down a bit.

while I proceed to finish the workshop section on complexity measures and timeseries surrogates.

Perfect! I'm wrapping up the transfer entropy stuff as we speak.

For the workshop, what did you have planned when it comes to causality? Just using the basic transfer entropy with manual control of embedding parameters? Or do you intend to use the automated procedure for embedding parameter search too?

Datseris commented 1 year ago

Well, that being said, given what you say in #237, Shannon and Renyi aren't entropy definitions: they are discrete entropy estimators. So we should change the supertype name Entropy to DiscreteEntropyEstimator. Which means that we have methods entropy(e::DiscreteEntropyEstimator, p::Probabilities). Which is totally fine.

Now this raises the question: since we do not have two types to dictate how to estimate the discrete entropy, why would we have two types to dictate how to estimate the continuous entropy?

That means, if you want the jackknife correction, or whatever other correction to Shannon, wouldn't you want a function call entropy(e::ShannonJackknife, p::Probabilities)? Isn't this one type deciding how to compute the entropy? For me this is by far the simplest possible way to do it: simply eradicate the concept of an "entropy definition" altogether, and let the estimator say what it computes.

In such a case, why would you need both an entropy definition and an entropy estimator?
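In code, the single-type approach I'm arguing for would look like this (a sketch; the concrete type names are hypothetical):

# Each estimator type fully determines what is computed and how.
abstract type DiscreteEntropyEstimator end
struct ShannonPlugin <: DiscreteEntropyEstimator end      # naive relative frequencies
struct ShannonJackknife <: DiscreteEntropyEstimator end   # jackknife-corrected

# entropy(::ShannonPlugin, p::Probabilities) and
# entropy(::ShannonJackknife, p::Probabilities) would then each encode both
# the definition (Shannon) and the estimation method in one type.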

Datseris commented 1 year ago

Ah, but in #237 you explicitly say "we should not have ShannonJackknife". Can we have a call tomorrow or so and clarify all of this?

kahaaga commented 1 year ago

Ah, but in https://github.com/JuliaDynamics/ComplexityMeasures.jl/issues/237 you explicitly say "we should not have ShannonJackknife". Can we have a call tomorrow or so and clarify all of this?

Can we clarify this after the workshop? I want to think this through properly before deciding on anything, and right now I don't have the capacity to both do that and finalize everything upstream.

Datseris commented 1 year ago

alright