deepcharles / ruptures

ruptures: change point detection in Python
BSD 2-Clause "Simplified" License

Question: How do I use ruptures to detect changes in large data streams? #320

Closed: nntp4 closed this issue 5 months ago

nntp4 commented 5 months ago

Description

I need to process more than 10 TB of data per day in Kafka clusters. In other words, how do I use ruptures with a distributed system to process large data streams?

tg12 commented 5 months ago

This question doesn't make much sense for this project, so it's time to close the book on this one. ruptures is largely an offline change point detection library, and honestly, this isn't the place for streaming questions. Change point detection methods fall into two categories: online methods, which spot changes in real time as each sample arrives, and offline methods, which look back over the whole signal after all the data is in. If you want to dive deeper, check out the Wikipedia page on change detection: https://en.wikipedia.org/wiki/Change_detection. If you are interested in streaming, try looking for Bayesian Online Changepoint Detection.
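To make the online/offline distinction concrete, here is a minimal sketch of an online detector that consumes one sample at a time and keeps only constant state. It is not part of ruptures and it is not Bayesian Online Changepoint Detection: it swaps in a simple two-sided CUSUM test as an illustration. The class name `OnlineCusum` and the `warmup`, `drift`, and `threshold` parameters are hypothetical choices for this example, and the input is a synthetic signal, not Kafka data.

```python
class OnlineCusum:
    """Two-sided CUSUM mean-shift detector (illustrative sketch, not ruptures)."""

    def __init__(self, warmup=20, drift=0.5, threshold=8.0):
        self.warmup = warmup        # samples used to estimate the baseline mean
        self.drift = drift          # slack subtracted from each deviation
        self.threshold = threshold  # alarm level for the cumulative sums
        self._reset()

    def _reset(self):
        # Forget the old regime and start estimating a new baseline.
        self.n = 0
        self.mean = 0.0
        self.pos = 0.0
        self.neg = 0.0

    def update(self, x):
        """Consume one sample; return True if a change is flagged."""
        if self.n < self.warmup:
            # Warm-up phase: running mean of the current regime.
            self.n += 1
            self.mean += (x - self.mean) / self.n
            return False
        dev = x - self.mean
        self.pos = max(0.0, self.pos + dev - self.drift)  # upward shifts
        self.neg = max(0.0, self.neg - dev - self.drift)  # downward shifts
        if self.pos > self.threshold or self.neg > self.threshold:
            self._reset()
            return True
        return False


# Synthetic stream: mean 0 for 100 samples, then mean 5, with a small
# deterministic +/-0.2 wobble standing in for noise.
stream = [(0.0 if i < 100 else 5.0) + (0.2 if i % 2 == 0 else -0.2)
          for i in range(200)]

detector = OnlineCusum()
alarms = [i for i, x in enumerate(stream) if detector.update(x)]
print(alarms)
```

On this toy signal the detector raises a single alarm shortly after the shift at index 100. In a real streaming setup, one such detector would typically run per key or partition inside the consumer, which is a design decision outside the scope of ruptures.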

nntp4 commented 5 months ago

@tg12 Thanks for your reply. I'm closing this issue now.