Closed missrg11 closed 1 year ago
Hi @missrg11 ,
Thx for your interest in ruptures
!
Could you share here a code snippet to reproduce the issue ? I would like to investigate this a bit deeper.
Best !
Hi, thanks for getting back to me on this.
After investigating more on the issue, it had to do with the fact that the lists with breakpoints used as input arguments were int32
rather than int
.
When the lists contain int
elements, the existing code works just fine.
Thx for the info ! Please, do not hesitate to send a code snippet so we can replicate this issue and make sure we cover every possible explanations.
Now, extrapolating on the new information you provided :
If breakpoints are indeed numpy lists of intXX
dtype
, then yes min
and max
values are governed by - 2**(XX-1)< value < 2**(XX-1)-1
.
If you are in int32
, then -2147483648< value < 2147483647
. And indeed, for signal for size ~150k, then n_samples * (n_samples - 1) / 2
is larger than 2147483647. Which may cause an overflow at the denominator.
Otherwise, if breakpoints are List
of Python plain int
, then you should be fine since there is no longer a limit to the value of integers (here)
Closing this issue for now ! Please feel free to reopen if needed !
I have noticed that when working with long signals ( >150,000 samples), an overflow occurs on the Randindex calculations, resulting to negative value for disagreement and out-of-range value for Randindex.
Not sure if this is due to system settings or general issue, however I would suggest slight adjustment to the code in order to avoid it. The issue does not occur when the order or the calculations changes.
Hence, within the Randindex function to replace this:
disagreement /= n_samples * (n_samples - 1) / 2
with this:
disagreement = (disagreement/n_samples) / (n_samples - 1) / 2