deepcharles / ruptures

ruptures: change point detection in Python
BSD 2-Clause "Simplified" License

how to determine the number of change points using ruptures? #257

Closed sharon1234567 closed 2 years ago

sharon1234567 commented 2 years ago

ruptures needs n_bkps as one of its input parameters. But it is usually impossible to choose an appropriate n_bkps without seeing the curve, or when the data dimension is too large to plot. Is there any way to help determine the number of change points?

deepcharles commented 2 years ago

Hi, sorry for the late reply.

This is one of the most difficult questions in change-point detection. When you do not know the number of changes beforehand, you must use a penalized approach (check this article for a definition). In ruptures, all methods (except Dynp) have a pen (for "penalty") argument that you can pass to predict. For instance, the following code detects mean shifts with a penalty of 10.

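import ruptures as rpt

# assume your signal is in a variable called `signal`
algo = rpt.Pelt(model="l2").fit(signal)  # the "l2" cost targets mean shifts
result = algo.predict(pen=10)  # breakpoint indexes; the last one is len(signal)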
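To make the role of pen more concrete, here is a minimal pure-Python sketch of a penalized segmentation for mean shifts. It minimizes the sum of squared deviations from each segment mean plus pen times the number of change points, using a plain O(n²) dynamic program. This is not how ruptures implements it (Pelt is much faster); the function name `penalized_segmentation` and the toy signal are made up for illustration.

```python
def penalized_segmentation(signal, pen):
    """Return breakpoints minimizing (l2 cost) + pen * (number of change points).

    Breakpoints follow the ruptures convention: the last one is len(signal).
    """
    n = len(signal)
    # Prefix sums give the cost of any segment in O(1).
    s1, s2 = [0.0], [0.0]
    for x in signal:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(a, b):
        # Sum of squared deviations from the mean of signal[a:b].
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    # best[t] is the optimal penalized cost of signal[:t].
    # Starting at -pen means the first segment is not penalized.
    best = [-pen] + [float("inf")] * n
    prev = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            candidate = best[s] + cost(s, t) + pen
            if candidate < best[t]:
                best[t] = candidate
                prev[t] = s

    # Walk back through the stored segment starts.
    bkps = []
    t = n
    while t > 0:
        bkps.append(t)
        t = prev[t]
    return bkps[::-1]


# Toy noise-free signal with two mean shifts, at indexes 10 and 20.
toy_signal = [0.0] * 10 + [5.0] * 10 + [0.0] * 10
small_pen_bkps = penalized_segmentation(toy_signal, pen=1)     # [10, 20, 30]
large_pen_bkps = penalized_segmentation(toy_signal, pen=1000)  # [30]
print(small_pen_bkps, large_pen_bkps)
```

A small penalty keeps both changes, while a very large one makes every change too expensive and returns a single segment. The number of change points is therefore an output of the penalty rather than an input.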

Now the issue is to find an appropriate value for pen. This heavily depends on the type of changes and the noise level. If you are detecting mean shifts, you can look at #4. You can also do a grid search and choose the value whose segmentation looks best to you. There exist supervised approaches if you have a few manually segmented signals, see here or here.

Hope this helps

deepcharles commented 2 years ago

Closing now. Feel free to reopen.