Closed. kahaaga closed this 11 months ago.
Merging #279 (009470c) into main (68fc7fa) will increase coverage by 0.46%. The diff coverage is 92.45%.
@@ Coverage Diff @@
## main #279 +/- ##
==========================================
+ Coverage 85.88% 86.34% +0.46%
==========================================
Files 57 64 +7
Lines 1438 1567 +129
==========================================
+ Hits 1235 1353 +118
- Misses 203 214 +11
Files Changed | Coverage Δ | |
---|---|---|
src/ComplexityMeasures.jl | 100.00% <ø> (ø) | |
src/core/information_measures.jl | 92.59% <0.00%> (-0.75%) | :arrow_down: |
src/core/probabilities.jl | 87.03% <20.00%> (-6.97%) | :arrow_down: |
src/discrete_info_estimators/schurmann.jl | 85.00% <85.00%> (ø) | |
src/discrete_info_estimators/chao_shen.jl | 88.23% <88.23%> (ø) | |
.../discrete_info_estimators/schurmann_generalized.jl | 92.30% <92.30%> (ø) | |
src/core/information_functions.jl | 100.00% <100.00%> (ø) | |
src/discrete_info_estimators/horvitz_thompson.jl | 100.00% <100.00%> (ø) | |
src/discrete_info_estimators/jackknife.jl | 100.00% <100.00%> (ø) | |
src/discrete_info_estimators/miller_madow.jl | 100.00% <100.00%> (ø) | |
... and 9 more | | |
So should I review this or is it still WIP? Can you resolve the git conflicts?
It's still WIP. There's some nuance in how a few of the estimators handle frequencies: they are designed for small sample sizes and explicitly require actual counts (their corrections are based on counting singletons, doubletons, and so on, in the data). Therefore, we can't naively convert probabilities to integers for an arbitrary estimator, because these estimators depend on the sample size. Introducing some arbitrary conversion factor when transforming `probs -> freqs` to get integers ignores that dependence. As a result, these estimators only work for probabilities obtained through actual counts (histograms, symbol frequencies), not for probabilities obtained through normalization (e.g. wavelet or power spectrum) or some other method (transfer operator).
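To illustrate the sample-size dependence, here is a minimal Python sketch (not the package's Julia code) of the Miller-Madow correction, which adds `(m - 1) / (2N)` to the plug-in Shannon entropy, where `m` is the number of outcomes with nonzero count and `N` is the total count. The names `plugin_entropy`, `miller_madow_entropy`, and `probs_to_counts`, as well as the scale factors, are hypothetical and chosen only for illustration.

```python
import math


def plugin_entropy(counts):
    """Plug-in (maximum likelihood) Shannon entropy in nats, from raw counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def miller_madow_entropy(counts):
    """Plug-in entropy plus the Miller-Madow bias correction (m - 1) / (2N)."""
    n = sum(counts)                      # total sample size N
    m = sum(1 for c in counts if c > 0)  # number of observed outcomes
    return plugin_entropy(counts) + (m - 1) / (2 * n)


def probs_to_counts(probs, factor):
    """Hypothetical naive conversion: scale by an arbitrary factor, then round."""
    return [round(p * factor) for p in probs]


probs = [0.5, 0.25, 0.125, 0.125]

# Same probabilities, three arbitrary conversion factors.
for factor in (8, 80, 800):
    counts = probs_to_counts(probs, factor)
    print(factor, counts, plugin_entropy(counts), miller_madow_entropy(counts))
```

The plug-in estimate is the same for every factor, while the Miller-Madow estimate changes with it, so an arbitrary `probs -> freqs` conversion amounts to picking an arbitrary sample size.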
I'm working on #280 concurrently, which I believe will provide some deeper insight into how to solve this, and what functionality we actually need (and what to ignore).
Superseded by #285
Fixes #237. WIP. No need to review yet - changes will be made. The docs are here.
Temporarily introduced `frequencies` and `frequencies_and_outcomes` instead of editing the `Probabilities` struct, just to be able to implement the actual estimators. I will go for the agreed-upon interface after I'm done with all the estimators.
Shannon entropy estimators
- `information` interface is based on estimators, not definitions. Convenience methods should default to `PlugIn` estimator with the `Shannon` measure.
- `PlugIn`.
- `Schürmann`
- `GeneralizedSchürmann`
- `MillerMadow`
- `ChaoShen`
- `HorvitzThompson`
- `JackknifeEstimator`