JuliaDynamics / RecurrenceAnalysis.jl

Recurrence Quantification Analysis in Julia
https://juliadynamics.github.io/RecurrenceAnalysis.jl/stable/
MIT License
45 stars 12 forks source link

`RecurrenceMatrix` does not take advantage of symmetry #100

Closed heliosdrm closed 3 years ago

heliosdrm commented 3 years ago

The data of RecurrenceMatrix is of type SparseMatrixCSC{Bool,Int}. This means that, although the constructor passes a symmetric matrix with only the entries of the upper triangle, the underlying data in the resulting RecurrenceMatrix makes a new "regular" sparse matrix duplicating them on the lower triangle.

As it is in version 1.3, this could be changed without modifications in the API if the data field of RecurrenceMatrix is changed to Symmetric{Float64,SparseMatrixCSC{Bool,Int64}}. However, with the FAN threshold method (#97) recurrence matrices can also be non-symmetric, so:

Datseris commented 3 years ago

For the next major version we could consider converting RecurrenceMatrix (and maybe the other types?) into parametric types, to take advantage of potential symmetry.

Changing the type parameterization of RecurrenceMatrix is in no way breaking, we could do it as soon as there is a PR that implements it. RecurrenceMatrix can be parameterized on the type of its data field, and this would allow further optimizations, e.g. dispatching on whether the type is Symmetric{Float64,SparseMatrixCSC{Bool,Int64}} or just SparseMatrixCSC{Bool,Int}

heliosdrm commented 3 years ago

Good, I'll write another PR in parallel with #97, taking care of making them mergeable without too many conflicts (I hope).

Datseris commented 3 years ago

I have yet another suggestion @hkraemer and @heliosdrm . Pr #97 is massive. Its git difference is very large, and difficult to follow. Perhaps we can clean things up by :

  1. Removing the genuinely new FAN-related code from #97.
  2. Merging changes.
  3. Opening a new pr with the actual FAN changes.
  4. In the same PR, add the type information additions which allow us to know whether a RM is symmetric based on its type.

I know this sounds like tedious work, but there is actually only one reason I suggest it: it will make mistakes/errors/bad design decisions impossible to hide.

heliosdrm commented 3 years ago

No need to merge non-FAN related changes from #97. It took me a while to catch it, but the 13 first commits of that branch were already squashed-merged into 1.3.0. Only its five last commits have relevant changes, and they are limited to matrices.jl and a few new tests.

@hkraemer : do you want to create a new PR using the code of that branch since da507d6 (the commit that matches 1.3.0)? If that's a problem, I could do it this week.

Datseris commented 3 years ago

Ah yes, true, thanks for clarifying that!

hkraemer commented 3 years ago

yepp, I can do that.

heliosdrm commented 3 years ago

I've seen the failed attempt in #101 . I've re-tried it (I hope with better success) in #102. The trick was using git cherry-pick on the commits since da507d6

heliosdrm commented 3 years ago

After a second thought... RecurrenceMatrix does take advantage of symmetry: the underlying sparse matrix contains the data points in both triangles (upper and lower, when only one of them is strictly necessary), but the only disadvantage of this is the amount of memory needed to store them.

On the other hand, some calculations (e.g. of vertical structures) would be far more expensive if the underlying sparse matrix only contained the upper or lower triangle, and those which can take advantage of symmetry (diagonal structures) already do it - only the upper triangle is used, thanks to the fact that RecurrenceMatrix is assumed to contain always symmetric data (and JointRecurrenceMatrix too).

There would be additional difficulties if we use triangular sparse matrices wrapped into the Symmetric type to save storage: functions for sparse arrays like rowvals, nzrange, etc. cannot be applied to such types, and accessing the underlying matrix to work around that is very error-prone (e.g. if for some reason the underlying matrix is not empty in the "ignored" triangle).

All in all, I think that in the end it is not that advisable to change the definition of the types to allow Symmetric data fields. It is better to leave it as it is, with SparseMatrixCSC{Bool,Int} data in all cases.

There are still questions regarding how to deal with symmetry if the FAN method is added. But I'll write about them in the corresponding PR (#102).