status-im / nimbus-eth1

Nimbus: an Ethereum Execution Client for Resource-Restricted Devices
https://status-im.github.io/nimbus-eth1/
Apache License 2.0
566 stars 110 forks source link

RLPx out of range crash triggered by message in auth handshake #766

Open jlokier opened 3 years ago

jlokier commented 3 years ago

There is a decoding vulnerability to untrusted network data in the RLPx auth code.

While connected to Ropsten test network, Nimbus-eth1 raised this error in rlpxAccess/decodeAuthMessageEip8 and took down the process.

It would be appropriate to go through and systematically check for similar length trust issues in all input in RLPx.

The Geth version shown is interesting. The hex part of clientId is not a valid commit in the Geth repo. Is it a branched old version of Geth, in which case perhaps it's correct protocol we are incorrectly decoding? Or is it a fake version, intentionally sending out adversarial messages?

(Ignore the eth/65 mention, this is running a small patch that doesn't touch any relevant code.

TRC 2021-07-22 22:14:23.044+01:00 Received Hello                             topics="rlpx" tid=909268 file=rlpx.nim:1208 version=5 id=Geth/v1.0.0-stable-fd84c648/linux-amd64/go1.13.4
TRC 2021-07-22 22:14:23.044+01:00 devp2p handshake completed                 topics="rlpx" tid=909268 file=rlpx.nim:1219 peer=Node[18.167.87.121:49934] clientId=Geth/v1.0.0-stable-fd84c648/linux-amd64/go1.13.4
TRC 2021-07-22 22:14:23.044+01:00 DevP2P local and remote protocols          topics="rlpx" tid=909268 file=rlpx.nim:209 local=[eth/65] remote="[eth/63, eth/64, eth/65]"
TRC 2021-07-22 22:14:23.045+01:00 >> Sending eth.Status (0x00) [eth/65]      tid=909268 file=eth65.nim:49 peer=Node[18.167.87.121:49934] td=8618164 bestHash=ecd382aade186e4fea76c3a56e25563a1f7515a724f93e49a4b904bc4649a917 networkId=3 genesis=41941023680923e0fe4d74a34bdac8141f2540e3ae90623718e47d66d1ca4a2d forkHash=63760190 forkNext=1700000
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(226) nimbus
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(188) process
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(74) colonanonymous
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/rlpx.nim(1170) rlpxAccept
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/auth.nim(469) decodeAuthMessage
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/auth.nim(370) decodeAuthMessageEip8
/home/jamie/Status/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/chcks.nim(32) raiseRangeErrorI
/home/jamie/Status/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/fatal.nim(49) sysFatal
[[reraised from:
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(226) nimbus
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(188) process
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(74) colonanonymous
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(252) rlpxAccept
]]
[[reraised from:
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(226) nimbus
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(188) process
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(96) colonanonymous
]]
Error: unhandled exception: value out of range: -21 notin 0 .. 9223372036854775807 [RangeError]
mjfh commented 3 years ago

Can confirm thal on my Goerli block collection test. Reproducible event:

/home/jordan/nimbus-eth1/nimbus/nimbus.nim(264) nimbus /home/jordan/nimbus-eth1/nimbus/nimbus.nim(226) process /home/jordan/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll /home/jordan/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(74) colonanonymous /home/jordan/nimbus-eth1/vendor/nim-eth/eth/p2p/rlpx.nim(619) dispatchMessages /home/jordan/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/chcks.nim(23) raiseIndexError2 /home/jordan/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/fatal.nim(49) sysFatal [[reraised from: /home/jordan/nimbus-eth1/nimbus/nimbus.nim(264) nimbus /home/jordan/nimbus-eth1/nimbus/nimbus.nim(226) process /home/jordan/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll /home/jordan/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(96) colonanonymous ]] Error: unhandled exception: index 33 not in 0 .. 32 [IndexError]

with always the same index bat varying time until the error occurs (best run was to collect 500k blocks which was enough for my replay.)

jlokier commented 3 years ago

Can confirm

No, it's not the same issue. What you've reported there is a different, unrelated bug also in RLPx.

Look at the exception: It's an IndexError rather than a RangeError, and it's raised in a different area of the code, which is shown in the stacktrace: rlpx.nim(619) dispatchMessages versus rlpx.nim(1170) rlpxAccept.

These stacktrace may be confusing if you're not used to Chronos. It's good to ignore the [[reraised from:...]] section. It's just an artifact of how exceptions are processed in a proc with pragma {.async.}.

It's a useful report though, thanks, so I've filed it as issue #767.

kdeme commented 3 years ago

Looks like a check is needed on the input versus that eciesOverheadLength. This is used on some other locations too, and yes, in general the code will need systematically checking on input sizes and such. I don't think in the past much attention has gone to this in the eth1 code.