Randindex overflow error

deepcharles / ruptures

ruptures: change point detection in Python

BSD 2-Clause "Simplified" License

1.56k stars 161 forks source link

Randindex overflow error #266

Closed missrg11 closed 1 year ago

missrg11 commented 1 year ago

I have noticed that when working with long signals ( >150,000 samples), an overflow occurs on the Randindex calculations, resulting to negative value for disagreement and out-of-range value for Randindex.

Not sure if this is due to system settings or general issue, however I would suggest slight adjustment to the code in order to avoid it. The issue does not occur when the order or the calculations changes.

Hence, within the Randindex function to replace this:

disagreement /= n_samples * (n_samples - 1) / 2

with this:

disagreement = (disagreement/n_samples) / (n_samples - 1) / 2

oboulant commented 1 year ago

Hi @missrg11 ,

Thx for your interest in ruptures !

Could you share here a code snippet to reproduce the issue ? I would like to investigate this a bit deeper.

Best !

missrg11 commented 1 year ago

Hi, thanks for getting back to me on this.

After investigating more on the issue, it had to do with the fact that the lists with breakpoints used as input arguments were int32 rather than int.

When the lists contain int elements, the existing code works just fine.

oboulant commented 1 year ago

Thx for the info ! Please, do not hesitate to send a code snippet so we can replicate this issue and make sure we cover every possible explanations.

Now, extrapolating on the new information you provided :

If breakpoints are indeed numpy lists of intXX dtype, then yes min and max values are governed by - 2**(XX-1)< value < 2**(XX-1)-1. If you are in int32, then -2147483648< value < 2147483647. And indeed, for signal for size ~150k, then n_samples * (n_samples - 1) / 2 is larger than 2147483647. Which may cause an overflow at the denominator.

Otherwise, if breakpoints are List of Python plain int, then you should be fine since there is no longer a limit to the value of integers (here)

oboulant commented 1 year ago

Closing this issue for now ! Please feel free to reopen if needed !