CouplingAnalysis.get_nearest_neighbors() is currently only used as a helper method for CouplingAnalysis.mutual_information() and CouplingAnalysis.information_transfer(), and is only tested indirectly through those. After porting the underlying C method to Cython in #195, it seemed sensible to gain more confidence in its correctness by giving it a test of its own.
To create a test fixture and an expected result to compare against, it is essential to understand what the method is actually supposed to do. While trying to do that, I found that it has redundant loops and variables defined in several places, which at the very least make it hard to read (see my comments here; the code appears to be adapted from a more generally applicable algorithm, but the adaptations have removed that wider applicability anyway).
I have mostly grasped its functionality by now, but I still don't really understand the special role that the calling methods mentioned above give to the z dimension. Other than that, here is what I have found CouplingAnalysis.get_nearest_neighbors() to be doing so far:
Given:
- $X = (x(t), y(t), z(t))$: an array of 3 timeseries with length $T$
- $d_{xyz}$: an array indicating where each timeseries is located within $X$ (depending on each timeseries' dimension)
- $k$: the number of nearest neighbors to look for

NOTE: the dimension of $X_i$ is 1 for all use cases within CouplingAnalysis, except for $z(t)$, which is either left empty in mutual_information() or can be of dimension > 1 in information_transfer().

For all times $t = 1,...,T$:
- find the $k$ times $t' \in \{1,...,T\}$ at which $X_i(t')$ is closest to $X_i(t)$ across all timeseries,
- out of all those $k$ times $t'$, take the largest of these distances within any timeseries as $\epsilon_{max}$,
- then, for each timeseries $X_i = x,y,z$:
  - count how many times $t'$ that timeseries on its own is within $\epsilon_{max}$ of $X_i(t)$ (this might be more (or also less?) often than $k$ times),
  - although neighbors $t'$ within $x$ and $y$ are only counted if $z$ also has a neighbor at the same time $t'$.
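To make the steps above concrete, and possibly to serve as a starting point for an expected result in a test, here is a minimal NumPy sketch of the procedure as I currently understand it. It is not the Cython implementation. The function name `neighbor_counts_sketch`, the return order, the use of the maximum norm, the non-strict comparison `<=`, and the exclusion of the point $t$ itself are all my assumptions and would need to be checked against the actual code and the papers.

```python
import numpy as np


def neighbor_counts_sketch(x, y, z, k):
    """Hypothetical brute-force sketch, NOT the actual Cython routine.

    x, y : 1-d sequences of length T
    z    : None (empty, as in mutual_information()) or array of shape (T, dim_z)
    k    : number of nearest neighbors in the joint space
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    T = x.shape[0]
    z = np.empty((T, 0)) if z is None else np.asarray(z, dtype=float).reshape(T, -1)

    def max_dist(a):
        # Pairwise maximum-norm distances, shape (T, T).
        # An empty timeseries (no columns) puts every point at distance 0.
        if a.shape[1] == 0:
            return np.zeros((T, T))
        return np.abs(a[:, None, :] - a[None, :, :]).max(axis=2)

    dx, dy, dz = max_dist(x), max_dist(y), max_dist(z)

    # Joint distance: largest distance within any of the timeseries.
    joint = np.maximum(np.maximum(dx, dy), dz)
    np.fill_diagonal(joint, np.inf)  # assumption: t is not its own neighbor

    # eps_max for each t: joint distance to the k-th nearest neighbor.
    eps = np.sort(joint, axis=1)[:, k - 1]

    not_self = ~np.eye(T, dtype=bool)
    in_z = (dz <= eps[:, None]) & not_self  # assumption: non-strict "within"

    k_z = in_z.sum(axis=1)
    # Neighbors in x (or y) only count if z has a neighbor at the same t'.
    k_xz = ((dx <= eps[:, None]) & in_z).sum(axis=1)
    k_yz = ((dy <= eps[:, None]) & in_z).sum(axis=1)
    return eps, k_xz, k_yz, k_z


eps, k_xz, k_yz, k_z = neighbor_counts_sketch(
    [0, 1, 2, 10], [0, 1, 2, 10], None, k=1
)
```

Under these particular assumptions, each count is at least $k$, because the $k$ joint neighbors are by construction within $\epsilon_{max}$ in every single timeseries. With a strict comparison `<` instead, the counts could drop below $k$, so which convention the real code uses matters for the "or also less?" question.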
I'm still not sure whether that is what it is supposed to be doing, though. The referenced papers, Kraskov (2004) and Runge (2012b), should probably be consulted.