Open cmicek1 opened 4 months ago
@cmicek1 Correct, the normalise/rescale step happens at the level of windows. Could you explain why the other method would be more accurate? And if you want to change the code, where would you change it, i.e., could you propose a working solution?
I'm very much a novice at any sort of recurrence analysis, so it's quite possible my thinking here is incorrect. My intuition was this: if the goal of a window-based analysis is to compare the recurrence statistics of one window with those of all the others, it might make sense to rescale their respective distances by the same amount, so that they are all in the same frame of reference, so to speak.
I.e., if I'm interested in seeing which time points or areas of my dataset exhibit the most recurrence with any of the others, but I want to see how recurrence evolves over time, or I can only process a portion of my dataset at a time, then rescaling by the same value for all windows makes that more straightforward. In this context, if window A has a high recurrence rate, and window B has a high recurrence rate, my intuition would be that the points in A and B are relatively close together as well, but this is only really true if the distances are rescaled by the same amount (or at least with respect to the maximum distance in the entire distance matrix). If they aren't, then the points in A are relatively close, and the points in B are relatively close, but you can't make a conclusion about how the points in A are related to the ones in B.
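To make the difference concrete, here is a small toy sketch. It is plain Python rather than the package's R code, the data are made up, and the helper names (distance_matrix, recurrence_rate, max_distance) are hypothetical. It assumes rescaling divides distances by the maximum distance and that the recurrence rate is the fraction of rescaled distances within the radius.

```python
def distance_matrix(xs, ys):
    """Absolute-difference distances between two 1-D series."""
    return [[abs(x - y) for y in ys] for x in xs]


def max_distance(dist):
    """Largest entry of a distance matrix."""
    return max(d for row in dist for d in row)


def recurrence_rate(dist, radius, scale):
    """Fraction of distances that fall within `radius` after dividing by `scale`."""
    cells = [d / scale for row in dist for d in row]
    return sum(c <= radius for c in cells) / len(cells)


# Made-up data: window B has the same shape as window A but 10x the spread.
window_a = [0, 1, 2, 1, 0]
window_b = [0, 10, 20, 10, 0]
series = window_a + window_b
radius = 0.25

dist_a = distance_matrix(window_a, window_a)
dist_b = distance_matrix(window_b, window_b)

# Per-window rescaling: each window is divided by its own maximum distance.
rr_a_local = recurrence_rate(dist_a, radius, max_distance(dist_a))
rr_b_local = recurrence_rate(dist_b, radius, max_distance(dist_b))

# Global rescaling: every window is divided by the maximum distance
# of the whole series, so all windows share one frame of reference.
global_max = max_distance(distance_matrix(series, series))
rr_a_global = recurrence_rate(dist_a, radius, global_max)
rr_b_global = recurrence_rate(dist_b, radius, global_max)

print(rr_a_local, rr_b_local)    # identical under per-window rescaling
print(rr_a_global, rr_b_global)  # differ under global rescaling
```

With per-window rescaling the two windows get the same recurrence rate even though the points in B are ten times farther apart. With a shared scale, the tightly clustered window A scores higher than B.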
I'm pretty sure this is just a question of what is a priority for a particular analysis: Do I care more about how points in a particular window compare with their neighbors within the same window, or do I care more about how different windows compare to each other?
A possible quick solution is to leave it up to the user, and let them rescale everything by a particular value if they feel like it. All that entails is adding a new option for rescale = 5 and then a new parameter they can supply to serve as the value to rescale by (I've attached a possible quick implementation).
crqa_user_rescale.zip
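For readers without the attachment, the idea can be sketched roughly as follows. This is a plain-Python illustration, not the contents of the zip and not the crqa R code: the function name rescale_distances and the parameter name rescale_value are hypothetical, only options 0, 1, and 2 from the documentation are mirrored alongside the proposed option 5, and options 3 and 4 are left out.

```python
def rescale_distances(dist, rescale=0, rescale_value=None):
    """Rescale a distance matrix (list of lists); hypothetical sketch.

    rescale = 0: do nothing
    rescale = 1: divide by the mean distance of the matrix
    rescale = 2: divide by the maximum distance of the matrix
    rescale = 5: divide by the user-supplied `rescale_value` (proposed option)
    """
    if rescale == 0:
        return [list(row) for row in dist]

    cells = [d for row in dist for d in row]
    if rescale == 1:
        scale = sum(cells) / len(cells)
    elif rescale == 2:
        scale = max(cells)
    elif rescale == 5:
        if rescale_value is None:
            raise ValueError("rescale = 5 requires rescale_value")
        scale = rescale_value
    else:
        raise ValueError("unsupported rescale option in this sketch")

    if scale <= 0:
        raise ValueError("rescaling constant must be positive")
    return [[d / scale for d in row] for row in dist]


# Two windows rescaled by the same user-chosen constant,
# so their distances stay directly comparable.
window_a = [[0, 2], [2, 0]]
window_b = [[0, 4], [4, 0]]
shared = 4
print(rescale_distances(window_a, rescale=5, rescale_value=shared))
print(rescale_distances(window_b, rescale=5, rescale_value=shared))
```

Passing the same constant to every window call is what puts the windows in a common frame of reference; how that constant is chosen (for example, the maximum distance over the full series) is left to the user.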
The genesis of this issue, though, was that the documentation for the wincrqa rescale and normalize parameters doesn't match what the code is actually doing. The documentation for the rescale parameter reads:
rescale: Rescale the distance matrix; if rescale = 0 (do nothing); if rescale = 1 (mean distance of entire matrix); if rescale = 2 (maximum distance of entire matrix). if rescale = 3 (minimum distance of entire matrix). if rescale = 4 (euclidean distance of entire matrix).
The normalize parameter reads:
normalize: Normalize the time-series; if normalize = 0 (do nothing); if normalize = 1 (Unit interval); if normalize = 2 (z-score).
So the description of these parameters should probably be updated to more clearly match the function behavior (i.e., whether they modify each window individually or the time series as a whole), regardless of what that ends up being.
@cmicek1 thank you! Those are very valid points. We are planning to add a few more features to the package in the near future, so we will make sure to address this suggestion, too.
Exactly as in the title. Even though the docstring for the function would seem to indicate otherwise, it looks like normalizing/rescaling only occurs at the window level and not the level of the entire time series. Amending crqa.R to accept provided normalization/rescale constants to apply is a possible fix.