P2P Relay Incentives? Potential for DoS?

ESultanik commented 3 years ago

What incentivizes a node to honestly participate in the P2P gossip protocol? The likelihood of being chosen as a relay is inversely proportional to the number of prior messages that node has transmitted.

https://github.com/trailofbits/x-audit-blockstack-core/blob/d35ef465e9fa2ce327a181117f8ca7933b9df075/src/net/relay.rs#L327-L361

Therefore, what prevents a malicious actor from flooding the P2P network with nodes that never actually relay any messages, thereby are almost always chosen to be relays, and thus induce a denial of service?

jcnelson commented 3 years ago

Keep in mind that this is just for inbound peers -- i.e. remote peers that have reached out to and connected to this peer, through no fault of its own. Outbound peers -- remote peers that this peer has decided to reach out to -- are not represented in this probability distribution, and relay decisions to outbound peers are made separately. So, if a malicious entity spun up, say, 10,000 peers and had them all connect to my node, my node would continue to relay messages to its outbound peers independently of these malicious inbound peers.

Now, can a malicious entity decrease the probability that other inbound peers receive messages from me by spinning up and connecting a bunch of peers to my node? Sure -- no matter what my inbound peer relay strategy is, if some inbound peers are going to receive the forwarded message, and some of them are not, then there is always a way for a malicious attacker to game the strategy to decrease the likelihood that an honest peer gets a message forwarded to it [1]. However, our inbound message relay strategy has three mitigating factors in place to prevent an attacker from outright denying inbound peers service:

The probability that an inbound peer receives a relayed message is never 0 (line 351).
Each inbound peer is going to execute a random neighbor walk and connect to other publicly-routable peers [2].
In the case of blocks and microblocks, peers will pull missing data from neighbors. They can detect missing data by inspecting the underlying Bitcoin chain, and seeing for itself whether or not a Bitcoin block sortitioned a Stacks block. From there, it can infer which neighbors have advertized that they have a copy of the block or microblock stream (tracked in src/net/inv.rs), and go and fetch it (via src/net/download.rs).

This second property means that as long as the victim peer has there's at least one non-attacked neighbor, then that neighbor will continue to relay messages to it correctly (i.e. with probability inversely proportional to how often the victim peer sends it messages that it has already seen before).

[1] The naive strategy of having each node send all its inbound peers a copy of a message is a non-starter. Because each and every node will do this, the set of nodes that act as outbound peers for a given peer will DDoS that peer -- i.e. if my node connects to 100 other nodes, then each of those 100 other nodes is going to send me a copy of every transaction and every block they learn about (RIP my EC2 instance). So, there must be an inbound peer relay strategy that doesn't do this. Ideally, the strategy would make it likely that my node only receives a copy of each in-flight message once -- any duplicates are waste (and cost me extra bandwidth money). The strategy here, which causes a node to send a message to an inbound neighbor with probability inversely proportional to how often that inbound neighbor has sent this node duplicate messages already, is meant to do this (per the comment block, an inbound peer that forwards my node data it has already seen is likely to be receiving that data from somebody else already, so my node doesn't need to be so aggressive at forwarding messages to it). We're open to a better strategy if you know of one!

[2] The neighbors the peer selects are chosen at random, via a variation of the Metropolis-Hastings random graph sampling algorithm. Specifically, neighbor discovery is a random graph walk, where the probability of stepping from one node to another is proportional to the ratio of the node's measured in-degree and out-degree in the Stacks peer graph, as well as its in-degree and out-degree in the ASs represented by the Stacks nodes. Moreover, the neighbor set is "pruned" (via src/net/prune.rs) to ensure that individual ASs are not over-represented. The underlying philosophy here is to maximize the amount of "work" an attacker needs to do to fully monopolize all neighbor connection from my node -- if my selection of neighbors is random, and favors nodes in many different ASs, then the attacker needs to insert itself in-between my node and every reachable neighbor my node knows about.

jcnelson commented 3 years ago

What incentivizes a node to honestly participate in the P2P gossip protocol?

To answer this at a high level, the short answer is "nothing." Honest nodes run the p2p gossip protocol to build up their replicas of all in-flight transactions and all blocks and microblocks (which will be pulled as well as pushed), so participation in the p2p gossip protocol is entirely self-serving. Our strategies for neighbor selection and forwarding selection are meant to make it difficult for an attacker with lots of VMs at their disposal to monopolize what data a victim peer sees; put another way, our strategies are meant to ensure that if the victim has a connection to at least one honest peer, it will continue to make progress in downloading and processing the blockchain. (N.B. this is why we don't use a DHT for neighbor discovery, for example).

diwakergupta / stacks-blockchain-tob-audit

P2P Relay Incentives? Potential for DoS? #33