kLabUM / rrcf

🌲 Implementation of the Robust Random Cut Forest algorithm for anomaly detection on streams
https://klabum.github.io/rrcf/
MIT License
495 stars, 112 forks

Streaming Data - calling a function when anomaly is detected #66

Open vinura opened 4 years ago

vinura commented 4 years ago

I am using the streaming mode of rrcf to detect anomalies in data generated by a sensor, and I was able to get it working. Is there a built-in function to classify data points, instead of looking at the graph to spot the anomaly? (Simply put, I want to call a custom Python function when there is an anomaly.)

I can use a threshold on the CoDisp value and call my function whenever it is exceeded. However, I have around 40 sensor inputs with different data patterns. So is the CoDisp value relative to each individual case, or can I use one general threshold?

mdbartos commented 4 years ago

Ultimately you will need some kind of threshold test on CoDisp that will be application-dependent. Using a percentile score is a pretty reliable approach.
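A minimal sketch of such a percentile test, for illustration only. It assumes the CoDisp scores come from your existing rrcf streaming loop (for example, the forest-averaged `tree.codisp(index)`) and are passed in one at a time. The class name `PercentileAlarm`, its parameters, and the default values are hypothetical and not part of rrcf.

```python
from collections import deque
import math


class PercentileAlarm:
    """Flag a score that exceeds the q-th percentile of recent scores.

    Hypothetical helper, not part of rrcf. The defaults are assumptions
    and should be tuned for the application.
    """

    def __init__(self, q=99.0, window=1000, warmup=50, on_anomaly=None):
        self.q = q                          # percentile level, 0-100
        self.scores = deque(maxlen=window)  # sliding window of past scores
        self.warmup = warmup                # scores needed before alarming
        self.on_anomaly = on_anomaly        # user callback, may be None

    def threshold(self):
        """Nearest-rank q-th percentile of the scores seen so far."""
        if not self.scores:
            return math.inf
        ordered = sorted(self.scores)
        rank = max(1, math.ceil(self.q / 100.0 * len(ordered)))
        return ordered[rank - 1]

    def update(self, score):
        """Test the new score against past scores, then store it."""
        is_anomaly = (
            len(self.scores) >= self.warmup and score > self.threshold()
        )
        if is_anomaly and self.on_anomaly is not None:
            self.on_anomaly(score)
        self.scores.append(score)
        return is_anomaly


def handle_anomaly(score):
    # Placeholder for the custom function you want to call.
    print("anomaly, CoDisp =", score)


alarm = PercentileAlarm(q=99.0, on_anomaly=handle_anomaly)
# Inside the streaming loop, after computing the CoDisp of the newest point:
#     alarm.update(avg_codisp)
```

The score is compared against the history before it is stored, so a large spike cannot raise its own threshold. Sorting the window on every update is simple but not the fastest option; for long windows a running quantile estimate may be preferable.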

To answer the second part, I would need more information about the format of your data points and what information each of them includes.

vinura commented 4 years ago

Say it is a water pressure reading from a pipeline. Each pipe has a different water pressure. I get the output as a number when I send a command to the sensor. So can I use a global threshold for CoDisp, or should it be unique to each pipe?

The idea I have is to create a new object for each pipe. But could the threshold value for CoDisp be global, or is it unique to the instance? If it is unique to the instance, how can I figure out different CoDisp thresholds for all 40 pipes?
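One possible middle ground, sketched below under stated assumptions: keep a single global percentile level, but let each pipe derive its own CoDisp cut-off from its own score history. Whether this suits the 40 pipes is not confirmed anywhere in this thread. The names `PipeAlarms` and `check`, and the default values, are hypothetical; the CoDisp scores are assumed to come from one rrcf forest per pipe.

```python
from collections import defaultdict, deque
import math


class PipeAlarms:
    """One global percentile level, one score history per pipe.

    Hypothetical helper, not part of rrcf.
    """

    def __init__(self, q=99.0, window=500, warmup=30):
        self.q = q            # shared by every pipe
        self.warmup = warmup  # scores a pipe needs before it can alarm
        # Each pipe ID lazily gets its own sliding window of scores.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def threshold(self, pipe_id):
        """Nearest-rank q-th percentile of this pipe's own past scores."""
        scores = self.history[pipe_id]
        if not scores:
            return math.inf
        ordered = sorted(scores)
        rank = max(1, math.ceil(self.q / 100.0 * len(ordered)))
        return ordered[rank - 1]

    def check(self, pipe_id, codisp, on_anomaly=None):
        """Test a pipe's new score against its own history, then store it."""
        scores = self.history[pipe_id]
        is_anomaly = (
            len(scores) >= self.warmup and codisp > self.threshold(pipe_id)
        )
        if is_anomaly and on_anomaly is not None:
            on_anomaly(pipe_id, codisp)
        scores.append(codisp)
        return is_anomaly


alarms = PipeAlarms(q=99.0)
# Inside the streaming loop, once per reading:
#     alarms.check(pipe_id, avg_codisp, on_anomaly=my_function)
```

With this layout the same CoDisp value can be an anomaly for a pipe whose scores are usually low and unremarkable for a pipe whose scores are usually high, while only one number (`q`) has to be chosen by hand.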

Also, the reason I chose this algorithm is that it can detect anomalies in streaming data. The pressure in the pipes changes with the time of day, and rrcf detects those anomalies without a problem. However, this threshold thing is hard to figure out. I hope it is a global one :D