Open vanakema opened 2 years ago
Figured this might be a helpful repo for reference https://github.com/rob-med/awesome-TS-anomaly-detection
Thanks @vanakema for detailing out the use cases. Anomaly detection IS in our roadmap - but a few months down the line.
Curious, what sort of algos worked best for you for detecting "abnormal" values? Does a simple rolling-average threshold work well enough, or are more advanced algos like seasonal pattern detection needed?
GitLab has written about basic anomaly detection with Prometheus rules, using z-scores and seasonality: https://about.gitlab.com/blog/2019/07/23/anomaly-detection-using-prometheus/
This sort of thing would also be possible with SigNoz, as we plan to make SigNoz compatible with Prometheus rules and Alertmanager.
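For anyone following along, the z-score idea mentioned above can be sketched in a few lines. This is only a rough illustration in plain Python rather than Prometheus rules; the function name `rolling_zscore_anomalies` and the `window` and `threshold` defaults are assumptions made for the example, not anything SigNoz or the GitLab post ships.

```python
from collections import deque
from math import sqrt


def rolling_zscore_anomalies(values, window=10, threshold=3.0):
    """Return indices of points that lie more than `threshold` standard
    deviations away from the mean of the preceding `window` points."""
    history = deque(maxlen=window)  # holds only the most recent `window` values
    anomalies = []
    for i, x in enumerate(values):
        # Only score a point once a full window of history is available.
        if len(history) == window:
            mean = sum(history) / window
            variance = sum((v - mean) ** 2 for v in history) / window
            std = sqrt(variance)
            # A perfectly flat window has std == 0; skip it to avoid dividing by zero.
            if std > 0 and abs(x - mean) / std > threshold:
                anomalies.append(i)
        history.append(x)
    return anomalies
```

Note that a flagged point still enters the history, so one large spike inflates the standard deviation for the next `window` points. It also ignores seasonality entirely, which is the part the GitLab post handles by comparing against the same time in previous weeks.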
We can also leverage Third Eye
This is built for Apache Pinot, which is an OLAP database similar to ClickHouse.
Might be worthwhile asking the Netdata team about lessons learnt applying ML to time series.
Thanks for the note @nwmcsween. Do you think Netdata does a good job applying ML to time series data? Any blogs/issues where they share more about it?
@pranay01 Namaste. ML and alarms in particular are Netdata's specialty, so it's worth having a look at it. I speak from 30 years of experience with Nagios, Zabbix, Elastic, OpenSearch, Influx, and many more, including Netdata. Netdata leans more heavily toward *nix than Windows and lacks OTel integration. That's why I'm looking at you guys right now.
Thanks @StefanSa - do you have relevant docs in NetData I should look at?
@pranay01 Certainly, not a problem. There is a lot of reading material here. As I said, alerting is also well done there.
https://learn.netdata.cloud/docs/ml-and-troubleshooting/anomaly-advisor
https://learn.netdata.cloud/docs/visualizations/netdata-charts#anomaly-rate-ribbon
https://learn.netdata.cloud/docs/ml-and-troubleshooting/metric-correlations
https://www.youtube.com/watch?v=2gJ36YuW6Ko
Alerting: https://learn.netdata.cloud/docs/alerting/
Live Demo: Live-Demo
Is your feature request related to a problem?
When you have a small team, you want to know when your app is misbehaving, with as little intervention as possible.
Describe the solution you'd like
SigNoz integrates an open source anomaly detection library to alert users if anything falls outside the "normal" range.
Some use cases:
Describe alternatives you've considered
Really the only alternative would be manually creating alerts in Prometheus or feeding SigNoz metrics into an anomaly detection library ourselves.
Additional context
The DataDog WatchDog feature is great because it automatically detects anomalous behavior. That is really helpful when you have a small team, or a team without a dedicated SRE person, since you no longer necessarily have to know what to look for.
Thank you for your feature request - we love each and every one!