Closed Datseris closed 1 year ago
@kahaaga in this post we should add the "total number of measures"
Sorry for the late reply. This already looks good! However, I'll post some comments on this tomorrow and provide a count of the number of available methods before we publish the release announcement.
@kahaaga I think I'll release DynamicalSystems.jl v3 on Sunday or Monday, and I think it makes conceptual sense to announce this package before v3, because I intend to link to it from the v3 release. If you can, please post some simple comments here; otherwise we can update the release later on.
The above are discrete entropies. If these are not your cup of tea, the package also has a generic interface for computing differential entropies.
I think this statement can be reduced to "The package also has a generic interface for computing differential entropies".
Each probability (defined by a "ProbabilityEstimator")
This should be ProbabilitiesEstimator.
... a count of the number of available methods before we publish the release announcement.
With a manual count, I get:
ProbabilitiesEstimators.
EntropyDefinitions.
DifferentialEntropyEstimators that handle multivariate input data.
DifferentialEntropyEstimators for univariate input data.
In summary:
This number would be even higher if we counted multiscale variations of these measures. But what has happened to the multiscale API, @Datseris? I can't find it in the most recent documentation. Is this intentional, or did it just slip out during restructuring? I thought we agreed to keep the multiscale API as is, but internally transition to a separate package for coarse-graining/sliding-window stuff later.
Ah, I see that you added a comment in the source code about the multiscale stuff not being part of the public API yet. Then we must have discussed this at length. We should probably resolve this before finalizing the paper on the software.
One can add a new type of probabilities estimator by extending a couple of simple functions (see dev docs) they immediately gain access to a plethora of functions for the corresponding entropies to the missing dispersion patterns complexity measure.
I'm not sure what the message in this last sentence is. I think there's a few words missing. I think what we want to say is
At the end of the listing of the number of available measures, I think we should also mention that there's more functionality in progress, like the multiscale API, which will give access to multiscale variants of all the discrete measures, some of which have been explored in the literature, and most of them not.
Hi @kahaaga !
78 (136) ways of estimating discrete ...
What does the number in the parentheses mean?
But what has happened to the multiscale API, @Datseris?
Yeah, just give me one and a half weeks! By that time I promise I will have initialized the "WindowedViewer.jl" package, which will offer various kinds of views of a timeseries. I'm a bit overwhelmed right now with finishing DynamicalSystems.jl v3.0 and also preparing to give a workshop on it at the MPI Evol. Biology. After I am done with that (3rd of March), I'll come back to the multiscale stuff.
For now, let's keep the numbers without the multiscale, it's okay. For the paper of course we will have the multiscale in!
I've added all your other comments into the post; as soon as you give the ok I'll post it on Discourse!
Yeah, just give me one and a half weeks! For now, let's keep the numbers without the multiscale, it's okay. For the paper of course we will have the multiscale in!
No worries! No need to rush this. I am also swamped until around the 10th of March, so I won't have any time to deal with this until then.
What does the number in the parentheses mean?
My bad! I don't know what happened to the formatting. It should be 78 (13 estimators * 6 entropy definitions) for the non-normalized discrete entropies, and 65 (13 estimators * 5 entropy definitions for which entropy_maximum is defined) for the normalized discrete entropies.
I've added all your other comments into the post; as soon as you give the ok I'll post it on Discourse!
When you've added the numbers in my previous comment, feel free to post. I also just created a Discourse user (the same username as I have here), so feel free to tag me there too!
Thanks. Can I ask a favor, BTW? Can you please update your GitHub profile with a picture and your affiliation? EDIT: just a picture, affiliation is there.
Sure! Just give me a few minutes.
I'm drafting a release announcement here. Comments posted will be incorporated into this top-level post.
ComplexityMeasures.jl (Entropies.jl successor)
I'm incredibly proud to announce ComplexityMeasures.jl, which I believe is one of the most well-thought-out packages in JuliaDynamics and one of the most well-thought-out packages in the whole of nonlinear dynamics. (wow big statements!)
https://juliadynamics.github.io/ComplexityMeasures.jl/stable/
Intro
ComplexityMeasures.jl contains estimators for probabilities, entropies, and other complexity measures derived from observations in the context of nonlinear dynamics and complex systems. It is the successor of the previous Entropies.jl package (which was never formally announced). We believe that ComplexityMeasures.jl is the "best" (most featureful, most extendable, most tested, fastest) open source code base for computing entropies and/or complexity measures out there. We won't offer concrete proof for this statement yet, but we are writing a paper on it, and once we have a preprint I will link it here.
Content
ComplexityMeasures.jl is a practical attempt at unifying the concepts of probabilities, entropies, and complexity measures. We (@kahaaga and @datseris) have spent several months designing a composable, modular, extendable interface that is capable of computing as many different variants of "entropy" or "complexity" as one can find in the literature.
The package first defines a generic interface for estimating probabilities out of input data. Each probability (defined by a ProbabilitiesEstimator subtype) also defines an outcome space, and functions exist to compute the probabilities and their outcomes, as well as other convenience calculations like the size of the outcome space or the missing outcomes. There are already a plethora of probabilities estimators, listed here with their input types:
CountOccurrences: Any
ValueHistogram: Vector, StateSpaceSet
TransferOperator: Vector, StateSpaceSet
NaiveKernel: StateSpaceSet
SymbolicPermutation: Vector, StateSpaceSet
SymbolicWeightedPermutation: Vector, StateSpaceSet
SymbolicAmplitudeAwarePermutation: Vector, StateSpaceSet
SpatialSymbolicPermutation: Array
Dispersion: Vector
SpatialDispersion: Array
Diversity: Vector
WaveletOverlap: Vector
PowerSpectrum: Vector
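To make the idea of an outcome space concrete, here is a minimal conceptual sketch in Python (not Julia, and not the package's API). The function names `relative_frequencies` and `unobserved` are hypothetical and chosen only for illustration. It assumes the simplest possible estimator: counting how often each outcome occurs.

```python
from collections import Counter

def relative_frequencies(data, outcome_space):
    """Return (probabilities, outcomes) for the outcomes that actually occur.

    Outcomes are reported in the order given by `outcome_space`.
    """
    counts = Counter(data)
    observed = [o for o in outcome_space if counts[o] > 0]
    n = len(data)
    probs = [counts[o] / n for o in observed]
    return probs, observed

def unobserved(data, outcome_space):
    """Outcomes in the outcome space that never occur in the data."""
    seen = set(data)
    return [o for o in outcome_space if o not in seen]

# Hypothetical example data: four possible outcomes, only three observed.
outcome_space = ["a", "b", "c", "d"]
data = ["a", "b", "a", "c", "a", "b"]

probs, outcomes = relative_frequencies(data, outcome_space)
total_outcomes = len(outcome_space)       # size of the outcome space
missing = unobserved(data, outcome_space)  # outcomes with zero counts
```

The point of the sketch is that probabilities, their outcomes, the size of the outcome space, and the missing outcomes all follow from one (estimator, data) pair.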
An intermediate representation used by some probabilities estimators is the Encoding, which maps elements of input data to the positive integers. Encodings allow for a large amount of code reuse, and they make more output measures possible from the same code.
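As an illustration of what such an encoding can look like, the following Python sketch maps each length-3 window of a timeseries to an integer in 1..3! according to its ordinal (permutation) pattern. This is a conceptual sketch under stated assumptions, not the package's implementation: the helper names `ordinal_pattern` and `make_encoder` are hypothetical, and the particular numbering of the patterns is an arbitrary choice.

```python
from itertools import permutations

def ordinal_pattern(window):
    """Indices that sort the window in ascending order (ties broken by position)."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))

def make_encoder(m):
    """Build a function mapping a length-m window to an integer in 1..m!."""
    table = {p: k for k, p in enumerate(permutations(range(m)), start=1)}
    return lambda window: table[ordinal_pattern(window)]

encode = make_encoder(3)

# Hypothetical example data.
series = [0.1, 0.5, 0.3, 0.9, 0.2, 0.4]

# Encode every overlapping window of length 3.
symbols = [encode(series[i:i + 3]) for i in range(len(series) - 2)]
```

Once the data are integers, any counting-based estimator can be reused on them unchanged, which is where the code reuse comes from.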
These probabilities can be used to compute an arbitrary number of entropies already defined in the library. The entropies themselves support an interface for different entropy estimators. It turns out, defining an entropy is one thing, but to estimate it there may be several ways. There are also a bunch of entropy definitions: Shannon, Renyi, Tsallis, Kaniadakis, Curado, StretchedExponential.
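To show how several entropy definitions can act on the same probabilities, here is a small Python sketch of the textbook Shannon, Rényi, and Tsallis formulas, plus a normalized Shannon entropy. The function names and signatures are assumptions made for this illustration and do not mirror the package's Julia interface.

```python
import math

def shannon(p, base=2):
    """Shannon entropy: -sum(p_i * log(p_i))."""
    return -sum(pi * math.log(pi, base) for pi in p if pi > 0)

def renyi(p, q, base=2):
    """Renyi entropy of order q; reduces to Shannon as q -> 1."""
    if q == 1:
        return shannon(p, base)
    return math.log(sum(pi ** q for pi in p if pi > 0), base) / (1 - q)

def tsallis(p, q):
    """Tsallis entropy of order q (natural-log Shannon entropy at q = 1)."""
    if q == 1:
        return shannon(p, base=math.e)
    return (1 - sum(pi ** q for pi in p if pi > 0)) / (q - 1)

def shannon_normalized(p, total_outcomes, base=2):
    """Shannon entropy divided by its maximum, log(total_outcomes)."""
    return shannon(p, base) / math.log(total_outcomes, base)

# Hypothetical probabilities over an outcome space of size 3.
p = [0.5, 0.25, 0.25]

h_shannon = shannon(p)
h_renyi = renyi(p, q=2)
h_tsallis = tsallis(p, q=2)
h_norm = shannon_normalized(p, total_outcomes=3)
```

Normalization needs the maximum of the entropy over the outcome space, so it is only available for definitions where that maximum is known.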
The package also has a generic interface for computing differential entropies instead of discrete ones.
On top of all that there is one more path: to compute "complexity measures", quantities related to entropies but not entropies in the formal mathematical sense.
Content in numbers
If counting everything above, there are 158 different complexity measures available out of the box.
Interface design
Perhaps the biggest victory of this package, and something no similar code base for computing entropy-related quantities has achieved, is its design.
Closing remarks
There's more functionality in progress, like the multiscale API, which will give access to multiscale variants of all the discrete measures, some of which have been explored in the literature, and most of them not!
We sincerely believe this package will accelerate scientific research that uses complexity measures to classify or analyze timeseries, and we welcome feature requests and pull requests on the GitHub repo!