movementlabsxyz / movement

The Movement Network is a Move-based L2 on Ethereum.
Apache License 2.0
50 stars 48 forks source link

LIB-1 The Transaction Sorting Algorithm In RocksdbMempool Can Lead To A Series Of Issues #321

Open SA124 opened 1 month ago

SA124 commented 1 month ago

Severity: Major Status: Pending Auditor: Movebit Topic:Suzuka-only custom components Discovery Methods: Manual Review

LIB-1 The Transaction Sorting Algorithm In RocksdbMempool Can Lead To A Series Of Issues

Code Location: protocol-units/mempool/move-rocks/src/lib.rs#37

Descriptions: The sorting algorithm for transactions in RocksdbMempool is as follows:

First, transactions are sorted by timestamp (the timestamp is in 2-second increments). Then, they are sorted by sequence_number. Finally, they are sorted by transaction hash.

Screenshot 2024-08-08 at 3 08 28 PM

The light_node transaction packing algorithm works as follows: It packs 10 transactions into one block. If there are fewer than 10 transactions, it packs a block every second. This could result in the following situation: Transactions A and B belong to the same account. Transaction A has a timestamp of 12 and a sequence_number of 11. Transaction B has a timestamp of 10 and a sequence_number of 15. In this case, transaction B would be executed before transaction A. However, Aptos rules require that transaction A must be executed before transaction B. This sorting algorithm could lead to a series of unknown issues, which would require a deep analysis of the Aptos-core module to understand fully.

Suggestion: It is recommended to refer to the sorting method used by Aptos-core.

l-monninger commented 3 weeks ago

This is actually a known issue. We were comfortable, for the moment, assuming that a client should only send transactions in rapid succession to one node. And, thus that the map from sequence numbers to accounts for a given slot would be non-decreasing.

There is also a bit of fundamental undecidability here which we had yet to determine that Aptos addresses in a much better manner. That is, you can't actually know that a user won't submit a lesser sequence number at a later stage. The best you can do is compensate as best you can for typical network conditions to make it such that if they are a well-behaving client, they won't encounter these issues.

This is where the single-node interaction and slot comes in. If you are interacting with a single node appropriately via the client, you should pretty never encounter a situation where you end up with the described inversion. This is because you will have either waited for confirmation of the transaction with the lower sequence number or sent a batch that will necessarily place all sequence numbers into non-decreasing slots because they will be piped to the DA in order.

However, around the edges of slots, there could be some problems if you are not quite using the API correctly. That is, if you send transactions without waiting confirmation to the same node, you may end up with the above inversion.

Of course, you can't order by sequence number then slot, because the sequence number is generated by the client. Everybody would then be incentivized to generate new accounts to obtain low sequence numbers and prioritize their transactions.

We could potentially smooth out the slots by assigning another mapping to a slot offset by $s'(t) = s(t + \frac{1}{2} \delta)$. Then, we assign slots by $\max s(t), s'(t)$. We held off on adding this because we were comfortable with the assumptions above, but perhaps this is a good time to reconsider.

poetyellow commented 2 weeks ago
  1. Our previously assumed attack method was incorrect.
  2. However, prioritizing transactions based on timestamps for block inclusion is unreasonable. Other blockchains prioritize gas fees as the first priority. Suppose an attacker, A, sends 1,000 low-gas-fee transactions within a short period; these transactions will always be prioritized for inclusion in blocks, causing subsequent transactions to wait a long time before being processed. This would block the normal transactions in the entire network. The attacker can delay the processing time of normal transactions with minimal cost.
poetyellow commented 2 weeks ago

Our audit was based on the 030fe011da8bd7eca75fa0a37197205767bef7ec commit . I'm not sure if you are talking about the latest source code.

l-monninger commented 3 days ago

However, prioritizing transactions based on timestamps for block inclusion is unreasonable. Other blockchains prioritize gas fees as the first priority. Suppose an attacker, A, sends 1,000 low-gas-fee transactions within a short period; these transactions will always be prioritized for inclusion in blocks, causing subsequent transactions to wait a long time before being processed. This would block the normal transactions in the entire network. The attacker can delay the processing time of normal transactions with minimal cost.

The timestamps being sorted here are based on when the transaction enters a given node's mempool. It is not set by the client. A given node is simply ordering by when it first saw a transaction. It uses slots to account for common network conditions, e.g., latency for clients who are geographically more distant.

l-monninger commented 3 days ago

@poetyellow