0xB10C / peer-observer

Tool to monitor for P2P anomalies and attacks using Bitcoin Core honeynodes
https://public.peer.observer
MIT License

Anomaly detection and alerting for interesting Bitcoin P2P metrics #13

Open 0xB10C opened 8 months ago

0xB10C commented 8 months ago

The current Grafana dashboards show the raw numbers from Prometheus (via the metrics tool). Anomaly detection and alerting are not yet implemented.

For example: [image]

Here, an anomaly could be a sudden drop in the number of inbound peers connected to one or more nodes, as in https://b10c.me/observations/05-inbound-connection-flooder-down/. To detect this, a Z-score could be used: if the Z-score exceeds a certain threshold, send an alert.
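
A minimal sketch of what this could look like as Prometheus recording and alerting rules; the metric name `peerobserver_inbound_peers` is a placeholder, not necessarily what the metrics tool actually exports:

```yaml
groups:
  - name: inbound-peer-anomalies
    rules:
      # rolling mean and standard deviation of the inbound peer count
      # over the last day
      - record: instance:inbound_peers:avg_1d
        expr: avg_over_time(peerobserver_inbound_peers[1d])
      - record: instance:inbound_peers:stddev_1d
        expr: stddev_over_time(peerobserver_inbound_peers[1d])
      # Z-score: how many standard deviations the current value is
      # away from the rolling mean
      - record: instance:inbound_peers:zscore
        expr: >
          (peerobserver_inbound_peers - instance:inbound_peers:avg_1d)
          / instance:inbound_peers:stddev_1d
      # a strongly negative Z-score means a sudden drop
      - alert: InboundPeersDrop
        expr: instance:inbound_peers:zscore < -3
        for: 10m
```

The threshold of 3 standard deviations is arbitrary and would need tuning against our historical data.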

[image]

Here, a spike in outbound (and also inbound) address messages across all nodes could indicate an anomaly. A Z-score could be used here as well. There may be other approaches worth exploring for detecting anomalies.
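
A similar sketch for the address message rate, again with a made-up counter name (`peerobserver_addr_messages_total`) and a PromQL subquery for the rolling statistics:

```yaml
groups:
  - name: addr-anomalies
    rules:
      - alert: AddrMessageSpike
        # fires when the current 5-minute addr rate is more than 3
        # standard deviations above its one-day rolling mean
        expr: >
          (
            rate(peerobserver_addr_messages_total[5m])
            - avg_over_time(rate(peerobserver_addr_messages_total[5m])[1d:5m])
          )
          / stddev_over_time(rate(peerobserver_addr_messages_total[5m])[1d:5m])
          > 3
        for: 15m
```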

This issue can be used for discussion and brainstorming.

0xB10C commented 8 months ago

Automatically detecting spy nodes is a possibility too. Here, the anomaly is that they only listen to data from us but never send us new transactions (or blocks). This is a bit more involved, as it requires us to keep track of the state of what we send to a peer and what they send us.
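
As a rough first approximation on the Prometheus side, and assuming hypothetical per-peer message counters with `direction` and `command` labels (peer-observer's actual metrics may be shaped differently), one could flag peers we announced many transactions to that announced nothing back:

```yaml
groups:
  - name: spy-node-detection
    rules:
      - alert: PossibleSpyNode
        # peers we sent more than 100 inv messages to in the last
        # hour, `unless` they sent any back; `unless` also covers
        # the case where the "received" series does not exist at all
        expr: >
          sum by (peer) (
            increase(p2p_messages_total{direction="sent", command="inv"}[1h])
          ) > 100
          unless
          sum by (peer) (
            increase(p2p_messages_total{direction="received", command="inv"}[1h])
          ) > 0
```

This only catches the crudest case; a smarter spy node could send the occasional transaction, so tracking send/receive ratios per peer over time would be more robust.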

0xB10C commented 6 months ago

Detecting the anomalies as described in https://arxiv.org/pdf/2108.00815v1 should be possible too.

i-am-yuvi commented 6 months ago

Indeed, let's see if those metrics can be used to identify the anomalies.

0xB10C commented 5 months ago

We should also monitor outbound connections. We expect to always have a minimum of 11 outbound connections. If we have fewer for a longer period of time, or see a large drop in outbound connections across multiple nodes at the same time, it's probably an anomaly.
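
Sketched as an alerting rule, assuming a per-node gauge (`peerobserver_outbound_peers` is again a placeholder name):

```yaml
groups:
  - name: outbound-connection-alerts
    rules:
      - alert: LowOutboundConnections
        # fewer outbound connections than the expected minimum of 11,
        # sustained for half an hour
        expr: peerobserver_outbound_peers < 11
        for: 30m
```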

i-am-yuvi commented 5 months ago

Indeed we can have alerts on that too!

i-am-yuvi commented 5 months ago

> We should also monitor outbound connections. We expect to always have a minimum of 11 outbound connections. If we have fewer for a longer period of time, or see a large drop in outbound connections across multiple nodes at the same time, it's probably an anomaly.

Having alerts on individual nodes as well as overall could be a better idea, because then we'll know which nodes are experiencing anomalies. Any thoughts on that?
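
For example, with the same placeholder gauge as above, both could live side by side: a per-node alert (the `instance` label identifies the affected honeynode) and a fleet-wide one that fires when several nodes drop at once:

```yaml
groups:
  - name: outbound-connection-scopes
    rules:
      # per-node: fires once per affected instance
      - alert: NodeLowOutboundConnections
        expr: peerobserver_outbound_peers < 11
        for: 30m
      # fleet-wide: fires when three or more nodes are below the
      # minimum at the same time (the threshold is arbitrary)
      - alert: FleetOutboundConnectionsDrop
        expr: count(peerobserver_outbound_peers < 11) >= 3
        for: 10m
```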

0xB10C commented 5 months ago

> We should also monitor outbound connections. We expect to always have a minimum of 11 outbound connections. If we have fewer for a longer period of time, or see a large drop in outbound connections across multiple nodes at the same time, it's probably an anomaly.
>
> Having alerts on individual nodes as well as overall could be a better idea, because then we'll know which nodes are experiencing anomalies. Any thoughts on that?

Yes, sounds good!

0xB10C commented 1 month ago

I came across the blog post "How to use Prometheus to efficiently detect anomalies at scale" (based on the talk https://www.youtube.com/watch?v=BTAba-Vq3xE). This looks interesting and is something I want to try out.

They published Prometheus recording rules here: https://github.com/grafana/promql-anomaly-detection