JuliaDynamics / ComplexityMeasures.jl

Estimators for probabilities, entropies, and other complexity measures derived from data in the context of nonlinear dynamics and complex systems
MIT License
48 stars 11 forks source link

Finish devdocs, deprecations, and exported names #336

Closed Datseris closed 6 months ago

Datseris commented 7 months ago

@kahaaga I am having trouble understanding why the method

            cts, Ωobs = allcounts_and_outcomes(Shrinkage(), outcomemodel, x)

exists and is tested. How can a probabilities estimator be used as an input to the counts function? This doesn't make sense to me. I really can't imagine what this function call is supposed to do.

Datseris commented 7 months ago

I'll go ahead and remove this function because it doesn't make conceptual sense.

Datseris commented 7 months ago

I'll wait a comment before I change this.

Datseris commented 7 months ago

@kahaaga are you around to reply here?

kahaaga commented 7 months ago

@Datseris I'll have a look at this tonight/tomorrow. Please hold a bit with the merge.

Datseris commented 6 months ago

@kahaaga reminder that I need your feedback here!

kahaaga commented 6 months ago

This looks good, modulo some minor changes that will occur due to #341. I think best order of business here is to merge this first, then do #341.

Just merge the latest changes from main, then this is good to squash and merge. I'll fix anything that I didn't catch here in #341.

Datseris commented 6 months ago

My point is here:

https://github.com/JuliaDynamics/ComplexityMeasures.jl/pull/336#issuecomment-1826437062

Why do these methods exist? I would like to remove such methods in this PR unless there is very good justification to keep them.

kahaaga commented 6 months ago

cts, Ωobs = allcounts_and_outcomes(Shrinkage(), outcomemodel, x)

I can't find a single call to this method for any of the ProbabilitiesEstimators on main. It isn't tested. What is tested is allcounts_and_outcomes(::OutcomeModel, x).

Why do these methods exist? I would like to remove such methods in this PR unless there is very good justification to keep them.

I'm honestly not sure why the method that has the ProbabilitiesEstimator as the first argument is there to begin with. There may have been a good argument for having them at some point, but I can't see what it would be with the code in its current state. I say remove it.

Datseris commented 6 months ago

E.g., the Shrinkage test file is full of calls to

est = Shrinkage()
            outcomemodel = os[i]

            cts, Ωobs = allcounts_and_outcomes(est, outcomemodel, x)

in any case, we agree, there is no reason to allow giving probabilities estimators to the count function so i'll remove all of these.

kahaaga commented 6 months ago
allcounts_and_outcomes(est, outcomemodel, x)

As far I can tell, there's no line in /src where allcounts_and_outcomes(est, outcomemodel, x) is called, so we can just remove it. The method was tested because it was defined, but it isn't used (anymore?).

kahaaga commented 6 months ago

AddConstant tests the same.

My comment above goes for this case too.

Datseris commented 6 months ago

@kahaaga is WeightedOrdinalPatterns a count based outcome space or not? It is define as one, because WeightedOrdinalPatterns{M,F} <: OrdinalOutcomeSpace{M} but in the test suite in the outcome_spaces.jl file, it is tested as not being count based. Which one is correct?

kahaaga commented 6 months ago

@kahaaga is WeightedOrdinalPatterns a count based outcome space or not? It is define as one, because WeightedOrdinalPatterns{M,F} <: OrdinalOutcomeSpace{M} but in the test suite in the outcome_spaces.jl file, it is tested as not being count based. Which one is correct?

It is not count-based, because when calling fasthist!(WeightedOrdinalPatterns, x) we return a normalized histogram (Vector{Float64}), not a vector of counts (Vector{<:Integer}).

It may have been a bit premature to put all the ordinal pattern based outcome spaces under the same supertype. Strictly speaking, we should have OrdinalPatterns <: CountBasedOrdinalOutcomeSpace and WeightedOrdinalPatterns <: OrdinalOutcomeSpace and AmplitudeAwareOrdinalPatterns <: OrdinalOutcomeSpace or something along those lines.

Datseris commented 6 months ago

@kahaaga I see one more case where counts uses a probabilities estimator:

function information(hest::GeneralizedSchürmann{<:Shannon}, pest::ProbabilitiesEstimator, o::OutcomeSpace, x)
    (; definition, a) = hest

    cts = counts(pest, o, x)
    N = sum(cts)

    if a isa Real
        h = digamma(N) - 1/N * sum(nᵢ * Gₙ(a, nᵢ) for nᵢ in cts)
    else
        # Assumes a[i] corresponds to cts[i]
        h = digamma(N) - 1/N * sum(nᵢ * Gₙ(aᵢ, nᵢ) for (nᵢ, aᵢ) in zip(cts, a))
    end

    # Grassberger's estimate of `h` is based on the natural logarithm, so we must convert
    # to the desired base.
    return convert_logunit(h, MathConstants.e, definition.base)
end

Here I guess the solution is to simply ignore the probabilities estimator, as GeneralizedSchürmann does not use it?

kahaaga commented 6 months ago

Here I guess the solution is to simply ignore the probabilities estimator, as GeneralizedSchürmann does not use it?

Yes, that should do it.

Datseris commented 6 months ago

Should we rename GeneralizedSchürmann to GeneralizedSchuermann, as it is not clear to me at least how to input umlauts using the standard Julia infrastructure: there isn't any latex completion for them.

kahaaga commented 6 months ago

Should we rename GeneralizedSchürmann to GeneralizedSchuermann, as it is not clear to me at least how to input umlauts using the standard Julia infrastructure: there isn't any latex completion for them.

Umlauts are a standard input at least on Norwegian keyboards, but that may not be the case for all languages? In any case, we can rename it.

codecov[bot] commented 6 months ago

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (d265873) 88.17% compared to head (7cdcdfe) 87.91%.

:exclamation: Current head 7cdcdfe differs from pull request most recent head 4fc1b01. Consider uploading reports for the commit 4fc1b01 to get more accurate results

Files Patch % Lines
src/core/counts.jl 85.71% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #336 +/- ## ========================================== - Coverage 88.17% 87.91% -0.26% ========================================== Files 78 78 Lines 2046 2011 -35 ========================================== - Hits 1804 1768 -36 - Misses 242 243 +1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Datseris commented 6 months ago

Umlauts are a standard input at least on Norwegian keyboards, but that may not be the case for all languages? In any case, we can rename it.

It isn't for UK or GR keyboards. I am happy to allow unicode in names as long as standard Julia input (i.e., tab completion) can create them, but with umlauts I don't see how that's possible. So I'll use the standard ue instead.

Datseris commented 6 months ago

woohoo this is done and all tests pass!

kahaaga commented 6 months ago

woohoo this is done and all tests pass!

Nice! Feel free to merge when CI tests pass too.