[WIP] Dedicated structs for recurrence matrices

Datseris commented 5 years ago

See discussions in #21 , #20

Datseris commented 5 years ago

At the moment it looks like this:

julia> R = RecurrenceMatrix(x, 0.1)
RecurrenceMatrix of size (1000, 1000) with 189740 entries:
  [1   ,    1]  =  true
  [14  ,    1]  =  true
  [15  ,    1]  =  true
  [18  ,    1]  =  true
  [23  ,    1]  =  true
  [32  ,    1]  =  true
  [34  ,    1]  =  true
  [37  ,    1]  =  true
  [38  ,    1]  =  true
  ⋮
  [948 , 1000]  =  true
  [954 , 1000]  =  true
  [960 , 1000]  =  true
  [962 , 1000]  =  true
  [973 , 1000]  =  true
  [983 , 1000]  =  true
  [992 , 1000]  =  true
  [996 , 1000]  =  true
  [1000, 1000]  =  true

julia> C = CrossRecurrenceMatrix(x,y,0.1)
CrossRecurrenceMatrix of size (1000, 1000) with 189103 entries:
  [5   ,    1]  =  true
  [6   ,    1]  =  true
  [13  ,    1]  =  true
  [17  ,    1]  =  true
  [21  ,    1]  =  true
  [30  ,    1]  =  true
  [44  ,    1]  =  true
  [46  ,    1]  =  true
  [47  ,    1]  =  true
  ⋮
  [969 , 1000]  =  true
  [975 , 1000]  =  true
  [982 , 1000]  =  true
  [985 , 1000]  =  true
  [988 , 1000]  =  true
  [989 , 1000]  =  true
  [994 , 1000]  =  true
  [997 , 1000]  =  true
  [999 , 1000]  =  true

julia> J = JointRecurrenceMatrix(x,y,0.1)
JointRecurrenceMatrix of size (1000, 1000) with 36682 entries:
  [1   ,    1]  =  true
  [14  ,    1]  =  true
  [34  ,    1]  =  true
  [95  ,    1]  =  true
  [105 ,    1]  =  true
  [138 ,    1]  =  true
  [165 ,    1]  =  true
  [248 ,    1]  =  true
  [249 ,    1]  =  true
  ⋮
  [667 , 1000]  =  true
  [687 , 1000]  =  true
  [705 , 1000]  =  true
  [727 , 1000]  =  true
  [752 , 1000]  =  true
  [881 , 1000]  =  true
  [902 , 1000]  =  true
  [929 , 1000]  =  true
  [1000, 1000]  =  true

Datseris commented 5 years ago

@heliosdrm I will have to update all the recurrencerate and co. functions. I've noticed something weird in the source: e.g. in recurrencerate:

    typeof(0.0)( (count(!iszero, x)-theiler_nz)/prod(size(x)) )

first of all, why typeof(0.0) and not Float64 ? (the latter is also not necessary as / always returns Float64)

secondly, why count(!iszero, x) ? the function nnz(x) gives the amount of nonzero elements.

Can I do those substitutions already? Also, prod(size(x)) is equivalent with length(x) for our matrices.

heliosdrm commented 5 years ago

About typeof(0.0): yes, if we are sure that it will be Float64 in any system, it's ok to replace it. I originally included them in all RQA functions to ensure that the return type was consistent. If the code ensures it, it can be removed.

recurrencerate dispatched on AbstractMatrix, not specifically on SparseMatrixCSC, thus nnz could fail. If it is changed to dispatch on the new AbstractRecurrenceMatrix type, and the underlying matrix of this type can only be sparse, it can be changed to nnz.

Datseris commented 5 years ago

Hm, @heliosdrm I must have messed up somewhere. I've reformed the core source code (no plotting / windowed yet) to use the new types. But, from the tests, these ones:

@test rqapar["DET"] == 364/426
@test rqapar["L"] == 364/50
@test rqapar["ENT"] ≈ 2.02058292

are now failing. They have close values but not exact anymore. Do you see the obvious mistake? If not I'll start searching again later. What confused me is that there are two methods for verticalhistogram, diagonalhistogram: one for sparse matrices one for normal matrices. I may have messed up there...

heliosdrm commented 5 years ago

I'll check...

heliosdrm commented 5 years ago

Ok, I found It. A bug in the method verticalhistogram that was not being hit before, but it is now. I'll fix it tomorrow.

heliosdrm commented 5 years ago

Happy new year! Fixed the issue with the histograms. I have also started to fix the windowed methods, but they are still failing because iterate is not yet defined for the new types.

Datseris commented 5 years ago

Happy new year @heliosdrm !!! Thanks for the fix! I am extending iterate now then :)

Datseris commented 5 years ago

Extending iterate was so trivial, oh man I love metaprogramming. Windowed now works with Recurrence and CrossRecurrence, but not with JointRecurrence. The problem happens here:

            ret_ex = quote
                $ex_rmx
                $ex_rmy
                JointRecurrenceMatrix($rm1 .& $rm2)
            end

where $rm1 .& $rm2 produces a BitArray instead of a SparseArray. Do you want me to extend JointRecurrenceMatrix to be able to accept BitArrays? I don't know how much this will mess up with conciseness of the code, so if we can avoid it, it may be better.

heliosdrm commented 5 years ago

I think It would be better to make rm1 .& rm2 be SparseMatrixCSC before passing it to JointRecurrenceMatrix

Datseris commented 5 years ago

I don't know how to do that... :S Due to metaprograming it tells me that SparseArrays is not defined... :S

heliosdrm commented 5 years ago

But SparseArrays is imported by RecurrenceAnalysis so RecurrenceAnalysis.SparseArrays should be available.

Datseris commented 5 years ago

I've named the field of the matrices .m so far. Should we consider making it .data instead, to be mroe intuitive with respect to the Dataset ? The underlying data structure is of course different, but maybe consistency will make memorization easier.

Datseris commented 5 years ago

re: on the SparseMatrix on @window : RecurrenceAnalysis.SparseMatrixCSC works.

But I still don't know what to do as I don't know what is the Bitarray is and how I should convert it to a sparse matrix...

heliosdrm commented 5 years ago

Try this:

       ret_ex = quote
             $ex_rmx
             $ex_rmy
             JointRecurrenceMatrix($rm1.data .& $rm2.data)
       end

This should give JointRecurrenceMatrix an argument that is a sparse array, instead of a bit array.

[EDIT: I wrote rm1.m, etc. Of course should be rm1.data if we change the nomenclature.]

Datseris commented 5 years ago

Alright, that fixed it. There is now only 2 left failing :

@windowed(rrw = recurrencerate(rmatw), width=50, step=40)
@windowed rqaw = rqa(rmatw) width=50 step=40

these two lines give method missmatch. They try to pass a SparseMatrix into recurrencerate and diagonalhistogram. I'll leave this to you only because the @windowed code is too complex for me to understand :D

heliosdrm commented 5 years ago

Ok, the windowed methods seem fixed (and I have also cleaned up a bit the code).

A remaining question is: now that RQA functions only dispatch on AbstractRecurrenceMatrix, is it still necessary to call the function for RQA entropy rqaentropy? There is no risk of confusion with other entropy methods anymore.

Datseris commented 5 years ago

I'd still vote to keep it rqaentropy. By itself entropy is vague, because there are many entropies, and each one can be computed for many quantities.

The function however is very specific: computes the Shannon entropy of the diagonal lengths, so I'd say rqaentropy is better.

(we can change this after this pr is merged)

pucicu commented 5 years ago

I would like to add here that there are different entropy measures derived from the recurrence matrix. It might be a good idea to keep this in mind when defining the names for entropies here. Some variants are:

– entropy of the (black) diagonal lines (the usually used RQA measure “entropy”) – entropy of the (black) vertical lines (not yet published but mentioned in G. Leonardi: A Method for the computation of entropy in the Recurrence Quantification Analysis of categorical time series, Physica A, 512, 824–836p. (2018)); extendable to the area distribution of block structures in RPs (see mentioned paper) – entropy of the white vertical lines (= recurrence times) which is also known as “recurrence time entropy RTE” or “recurrence period density entropy RPDE”, (M. A. Little, P. E. McSharry, S. J. Roberts, D. A. E. Costello, I. M. Moroz: Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection, BioMedical Engineering OnLine, 6(23), 1–19p. (2007)) – different possibilities to estimate K2, e.g. using the Grassberger-Procaccia algorithm, or the approach by Faure et al(P. Faure, H. Korn: A new method to estimate the Kolmogorov entropy from recurrence plots: its application to neuronal signals, Physica D, 122(1–4), 265–279p. (1998))

We are currently working on a review about these measures which will also suggest some important fine tunings of the “RQA entropy”. I will let you know as soon as our ideas can be published.

Best wishes Norbert

Am 02.01.2019 um 08:48 schrieb George Datseris notifications@github.com:

I'd still vote to keep it rqaentropy. By itself entropy is vague, because there are many entropies, and each one can be computed for many quantities.

The function however is very specific: computes the Shannon entropy of the diagonal lengths, so I'd say rqaentropy is better.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/JuliaDynamics/RecurrenceAnalysis.jl/pull/25#issuecomment-450802473, or mute the thread https://github.com/notifications/unsubscribe-auth/AA7WkJWYt0icE-C7FeEojt_mYQ-23iABks5u_GRrgaJpZM4ZleHy.

Datseris commented 5 years ago

Thanks Norbert for all this useful information!

I am wondering whether this function is necessary at all. In the package ChaosTools the function genentropy computes the generalized (Renyi) entropy, which also includes the Shannon entropy.

On the other hand, I can see that it is useful to have an entropy estimate to be included in the rqa bundle function. And ChaosTools is certainly too heavy of a dependency to add it here.

On the other other hand, as Norbert mentioned, you can choose many different aspects of the RecurrenceMatrix and compute their entropy. After the aforementioned review paper is out, maybe the different entropies are too many and we can safely drop rqaentropy in favor of genentropy and simply calling diagonalhist or verticalhist. We'll see as this progresses I guess.

JuliaDynamics / RecurrenceAnalysis.jl

[WIP] Dedicated structs for recurrence matrices #25