status-im / nimbus-eth2

Nim implementation of the Ethereum Beacon Chain
https://nimbus.guide
Other
516 stars 222 forks source link

[SEC] Peer-Pool - Peer reputation information is not persistent #1364

Open tintinweb opened 4 years ago

tintinweb commented 4 years ago

Description

A new peer that is not yet in the peer pool is assigned an initial peer score of 200. If a peer's score (a) is at the low mark or (b) disconnects it is removed from the peer-pool. This, however, means that a peer's score is automatically reset when the low mark is reached. It also allows a malicious peer to reset their own score by re-connecting to the peer.

https://github.com/status-im/nim-beacon-chain/blob/4140b3b9d9c58ea57bbab3299e69c8862d9063a1/beacon_chain/eth2_network.nim#L198-L203

Exploit Scenario

Peer A connects to peer B. Peer A interferes with B or is dishonest and gets downrated by B. If B's score for A hits the low watermark B will remove it from the pool. Alternatively A might disconnect, forcing B to remove it from the peer-pool.

Mitigation Recommendation

Define a strategy to persist the reputation of peers the node has interacted with. The node may choose to interact with less trusted nodes less often (acquire strategy), completely ban certain id's (significant negative score), give inhonest nodes a second chance after a certain amount of time has passed (or gradually recalculate their score an allow it to be negative). A peer-id's reputation should be checked in other domains as well if it is available.

Please verify that when a peer hits the low mark and it is removed from the peer pool in release() that the connection is also terminated. We were unable to find a call (in sync-manager) that actually disconnects from the peer once the mark is hit. The peer will certainly not be used again, but the connection appears still to be active.

References

https://github.com/status-im/nim-beacon-chain/blob/4140b3b9d9c58ea57bbab3299e69c8862d9063a1/beacon_chain/peer_pool.nim#L233-L238

https://github.com/status-im/nim-beacon-chain/blob/4140b3b9d9c58ea57bbab3299e69c8862d9063a1/beacon_chain/peer_pool.nim#L284-L287

https://github.com/status-im/nim-beacon-chain/blob/4140b3b9d9c58ea57bbab3299e69c8862d9063a1/beacon_chain/peer_pool.nim#L440-L460

cheatfate commented 3 years ago

There was some fixes which solves reconnection issue a bit. When peer's score is at the low mark it will be not only removed from PeerPool, but also disconnected and added to our SeenTable. peer's score (a) is at the low mark https://github.com/status-im/nim-beacon-chain/blob/devel/beacon_chain/beacon_node.nim#L671

  proc scoreCheck(peer: Peer): bool =
    if peer.score < PeerScoreLowLimit:
      try:
        debug "Peer score is too low, disconnecting", peer = peer,
              peer_score = peer.score, score_low_limit = PeerScoreLowLimit,
              score_high_limit = PeerScoreHighLimit
        # We do not care about result of this operation, because even if
        # disconnect operation fails, peer will still be added to SeenTable
        # and removed from PeerPool. So it will be not reused for syncing for
        # `SeenTablePenaltyError` time.
        asyncSpawn peer.disconnect(PeerScoreLow)
      except:
        discard
      false
    else:
      true

And disconnect() procedure is https://github.com/status-im/nim-beacon-chain/blob/devel/beacon_chain/eth2_network.nim#L392

proc disconnect*(peer: Peer, reason: DisconnectionReason,
                 notifyOtherPeer = false) {.async.} =
  # TODO: How should we notify the other peer?
  try:
    if peer.connectionState notin {Disconnecting, Disconnected}:
      peer.connectionState = Disconnecting
      # We adding peer in SeenTable before actual disconnect to avoid races.
      let seenTime = case reason
        of ClientShutDown:
          SeenTableTimeClientShutDown
        of IrrelevantNetwork:
          SeenTableTimeIrrelevantNetwork
        of FaultOrError:
          SeenTableTimeFaultOrError
        of PeerScoreLow:
          SeenTablePenaltyError
      peer.network.addSeen(peer.info.peerId, seenTime)
      await peer.network.switch.disconnect(peer.info.peerId)
      peer.connectionState = Disconnected
  except CatchableError as exc:
    # We do not care about exceptions in disconnection procedure.
    trace "Exception while disconnecting peer", peer = peer.info.peerId,
                                                reason = reason

So every peer which was removed with low score got banned for 60 minutes right now. https://github.com/status-im/nim-beacon-chain/blob/devel/beacon_chain/eth2_network.nim#L237-L238

tintinweb commented 3 years ago

@cheatfate If the node restarts, will it lose all score/seen data for the previously seen peers?

cheatfate commented 3 years ago

If the node restarts all the score/seen data will be lost.

But there no reason to keep it, because network performance/problems could give you very subjective vision about your peers and if you ban it in persistent manner its possible that you will ban all your shard in couple of minutes. This scoring is supposed to be used only for syncing... Because if you are not in synchronization process this peer scores will be stale. And the only place where we check and use this peer scores is syncing process. But syncing process is just a small part of node's duties.

tintinweb commented 3 years ago

(review: a0a4526c8347b12a672d3b1d333505f4572119c0 /misc-fixes)

This partially addresses the issue but I agree that scoring may be subject to network conditions and that it is a thin line to walk between avoiding to potentially ban many nodes (which is a threat by itself) vs. protecting against malicious nodes. From the perspective of a malicious node this means that they basically have two options to avoid getting banned:

So there are ways for a node to stay dishonest without getting banned. One requires the peer to immediately reconnect (with a 10mins penality on client shutdown; potentially with no penalty if it just disconnects?) before hitting the lowMark and the other depends on a way for the peer to trigger requests to improve their own score by providing some honest responses.

cheatfate commented 3 years ago

We recently changed syncing strategy as it was described here https://github.com/status-im/nimbus-eth2/issues/1525#issuecomment-707605181

disconnecting before dropping below the lowMark. the peer score is discarded and the peer will start with a fresh 200 on reconnect (incoming peers are put into the pool).

@tintinweb but you are missing the point that this peer will be added to the end of the PeerPool's list, and because SyncManager uses only 20 top scored peers from the PeerPool newly connected peer will have a small chance to be used by SyncManager worker again.

peers may make up for a -500 score penalty by providing 5 good block responses a 100 pts to stay above the lowMark. with a max score peer of 1000 a peer can provide up to 3 invalid responses (peer.score < PeerScoreLowLimit (1000)).

So as soon as this peer provides first invalid response it will be penalized and there present a big chance that it will be not used for the next iteration.