solana-labs / solana

Web-Scale Blockchain for fast, secure, scalable, decentralized apps and marketplaces.
https://solanalabs.com
Apache License 2.0

High rate of tvu ingress traffic causes delinquency #8861

Closed leoluk closed 4 years ago

leoluk commented 4 years ago

A high rate of inbound shreds to the tvu port (~50 kpps) causes validators to fall out of sync with the network and become delinquent.

This is how one of the recent TdS networks died (aided by the repair issues at the time, which prevented nodes from getting back in sync, so they could be attacked one by one vs. having to target 1/3).

Still works, tested on our TdS node by replaying the same valid shred a lot of times:

[image]

(you can see where it switches to its own fork and goes delinquent until traffic stops)
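For a sense of scale, here is a rough sketch of what ~50 kpps means in bandwidth terms. The packet size is an assumption, not something measured in this issue: it takes 1232 bytes of UDP payload per packet (Solana's `PACKET_DATA_SIZE`, i.e. 1280 minus IPv6 and UDP headers) and ignores header overhead. The function and constant names are made up for illustration.

```rust
/// ASSUMPTION: each shred fills a 1232-byte UDP payload (Solana's
/// PACKET_DATA_SIZE). Real packets may be smaller; headers are ignored.
const ASSUMED_PACKET_BYTES: u64 = 1232;

/// Convert a packet rate into payload bandwidth in Mbit/s (10^6 bits).
fn ingress_mbit_per_sec(packets_per_sec: u64, packet_bytes: u64) -> f64 {
    (packets_per_sec * packet_bytes * 8) as f64 / 1_000_000.0
}

fn main() {
    // ~50 kpps is the rate reported above.
    let mbit = ingress_mbit_per_sec(50_000, ASSUMED_PACKET_BYTES);
    println!("{mbit:.1} Mbit/s of tvu payload");
}
```

Under that assumption the rate works out to a bit under 500 Mbit/s, so it would fit within a single 1 Gbit/s uplink.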

mvines commented 4 years ago

Do you know if the same is true if the payload is an invalid shred? That would lower the bar for this attack from "anybody who gets enough stake to be in the leader schedule" to "anybody"

leoluk commented 4 years ago

Works with invalid shreds, too. Tried by replaying an SLP shred on TdS, still killed the node.

leoluk commented 4 years ago

I did not investigate how much of that is due to the TVU stalling vs. overloading the host OS.

sakridge commented 4 years ago

@leoluk can you share the hardware setup you used to produce this?

leoluk commented 4 years ago

16 vCores @ 2.3 GHz (Skylake Xeon), 64G RAM, NVMe disk
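As a back-of-the-envelope check, the sketch below divides the nominal clock cycles of this machine by the ~50 kpps rate from the original report. It is only an upper bound on the per-packet budget: it assumes all 16 vCores run at the full 2.3 GHz and are available for packet handling, which will not be true on a validator that is also replaying blocks, and vCores may be hyperthreads. It does not say whether the time goes to the TVU or to the host OS. The function name is hypothetical.

```rust
/// Nominal CPU cycles available per inbound packet, assuming every core
/// is fully available for packet handling (an optimistic ASSUMPTION).
fn cycles_per_packet(cores: u64, hz_per_core: u64, packets_per_sec: u64) -> u64 {
    cores * hz_per_core / packets_per_sec
}

fn main() {
    // 16 vCores @ 2.3 GHz from the setup above, ~50 kpps from the report.
    let budget = cycles_per_packet(16, 2_300_000_000, 50_000);
    println!("at most {budget} cycles per packet across all cores");
}
```

With these numbers the nominal budget is 736,000 cycles per packet summed over all cores, or 46,000 cycles if a single core has to handle every packet.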

leoluk commented 4 years ago

Reproduction instructions: https://gist.github.com/leoluk/bbdcb6173ec3dd3da8b2f6ca9512b2e9