AleoNet / snarkOS

A Decentralized Operating System for ZK Applications
http://snarkos.org
Apache License 2.0
4.06k stars 2.56k forks source link

[Bug] Malicious validator can send fake BlockResponse to block honest validators from processing messages #3315

Open elderhammer opened 2 weeks ago

elderhammer commented 2 weeks ago

🐛 Bug Report

  1. Block.Transactions limits the maximum number of transactions to 1048575 in the deserialization logic https://github.com/AleoNet/snarkVM/blob/454d555a0ee1478dce5fa1822b64525a361b6b27/ledger/block/src/transactions/bytes.rs#L29-L32
        // Ensure the number of transactions is within bounds.
        if num_txs as usize > Self::MAX_TRANSACTIONS {
            return Err(error("Failed to read transactions: too many transactions"));
        }
  2. Malicious validators can construct BlockResponse messages by filling a large number of fake transactions
  3. The victim's message processing logic is stuck because of the deserialization of a large number of fake transactions https://github.com/AleoNet/snarkOS/blob/cf83035ab79907329208a7f4e35d77e8e49d0596/node/bft/src/gateway.rs#L634-L635
  4. The victim's block synchronization may stop due to being stuck on deserialization, see: https://github.com/ProvableHQ/snarkOS/pull/6

Your Environment

snarkOS Version: cf83035ab79907329208a7f4e35d77e8e49d0596

elderhammer commented 2 weeks ago

I think there are two problems to solve:

  1. Refer to the changes in https://github.com/AleoNet/snarkOS/pull/3304 and put the deserialization work into the rayon thread
  2. Appropriately reduce MAX_TRANSACTIONS (I remember that each certificate can only have a maximum of 50 transactions)
vicsn commented 2 weeks ago

@joske can you confirm if this is an issue (and the related PR resolves it?)

joske commented 2 weeks ago

I think #3304 should avoid the node getting stuck on big blocks. That said, there's still a deadlock in sync that we haven't found yet.

I don't know about the MAX_TRANSACTIONS

elderhammer commented 2 weeks ago

I think #3304 should avoid the node getting stuck on big blocks. That said, there's still a deadlock in sync that we haven't found yet.

I don't know about the MAX_TRANSACTIONS

I've looked at change 3304, which is a change to node/router/src/inbound.rs, but the problem described there is within the BFT module: https://github.com/AleoNet/snarkOS/blob/cf83035ab79907329208a7f4e35d77e8e49d0596/node/bft/src/gateway.rs#L629-L644

The value of MAX_TRANSACTIONS is hardcoded to 1048575, which means that a fake BlockResponse may be filled with 1048575 transactions, which will make deserialization extremely time-consuming.

joske commented 2 weeks ago

it's probably a good idea to defer this deserialization to rayon in Gateway too, yes.