perama-v opened 1 year ago
I love the use case of running transaction traces!
I'm not sure that storing the state witness for every block is the best option we have for enabling that. It sounds like it means storing duplicates of the same state for every block in which it gets accessed. That might be a lot of duplicate data.
Another option is to "simply" extend the state network to be a full archive of all state from genesis.
I like to think about a theoretical transaction with index 100 in some historical block N:
---
title: Tracing transaction index 100 in block N requires preceding intra-block transactions.
---
flowchart RL
C --> B --> A
subgraph A[block N - 1]
direction BT
subgraph aB1[Header]
direction RL
asr[State root]
end
subgraph aT1[Transaction 0]
direction TB
end
subgraph aTx[...]
direction TB
end
subgraph aT22[Transaction 22]
direction TB
at99con1s1[update contract 1, storage slot 1]
end
subgraph aT23[Transaction 23]
direction TB
at99con1s2[update contract 1, storage slot 2]
end
aT23 --> aT22 --> aTx --> aT1 --> asr
end
subgraph B[block N]
direction BT
subgraph B1[Header]
direction RL
bsr[State root]
end
subgraph T1[Transaction 0]
direction TB
t0con1s1[update contract 1, storage slot 1]
end
subgraph Tx[...]
direction TB
end
subgraph T99[Transaction 99]
direction TB
t99con1s1[update contract 1, storage slot 1]
end
subgraph T100[Transaction 100]
direction TB
t100con1s1[read contract 1, storage slot 1]
end
T100 --> T99 --> Tx --> T1 --> bsr
end
subgraph C[block N + 1]
subgraph C1[Header]
direction RL
csr[State root]
end
end
subgraph P[proof containing every piece of state block N accesses]
direction TB
s1[contract 1, storage slot 1 value, before transaction 0]
end
P -.- B
Transaction 100 could read from a slot that was updated by any preceding transaction in that block. So every transaction (0-99) must be replayed. If you do not provide the state proof for every part of state that is accessed in that block, then you cannot be sure that the value is correct.
The state proof effectively is the database for tracing that block. It does not need to have storage slot values that were not accessed in that block. So when constructing the proof you replay a block, look for any state that was actually touched by the block and construct the proof containing all those values.
For example, contract 1 slot 2 is updated in block N - 1. So it will be part of the state root of block N, but it does not need to be in the block state proof for that block because it is not accessed.
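As a rough illustration of how such a proof could be assembled, here is a minimal sketch that replays a block via an execution client, collects the touched accounts and slots, and then fetches Merkle proofs for them. The use of debug_traceBlockByNumber with the prestateTracer and eth_getProof, and anchoring the proof to the parent block's state root, are assumptions for this sketch, not a prescribed mechanism.

```python
# Hypothetical sketch: build an "accessed state proof" for one block.
# Assumes an execution client that exposes debug_traceBlockByNumber
# (prestateTracer) and eth_getProof; method choices are illustrative only.
import json
import requests

RPC = "http://localhost:8545"

def rpc(method, params):
    payload = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    return requests.post(RPC, json=payload).json()["result"]

def build_accessed_state_proof(block_number: int):
    tag = hex(block_number)
    # The prestateTracer reports every account and storage slot that each
    # transaction in the block touched.
    prestate = rpc("debug_traceBlockByNumber", [tag, {"tracer": "prestateTracer"}])
    accessed = {}  # address -> set of accessed storage keys
    for tx in prestate:
        for address, account in tx["result"].items():
            accessed.setdefault(address, set()).update(account.get("storage", {}))
    # Prove each accessed account (and its accessed slots). Proving against the
    # parent block's state root (the pre-state of this block) is one possible
    # anchoring choice for this sketch.
    parent_tag = hex(block_number - 1)
    proofs = {
        address: rpc("eth_getProof", [address, sorted(slots), parent_tag])
        for address, slots in accessed.items()
    }
    return proofs

if __name__ == "__main__":
    print(json.dumps(build_accessed_state_proof(15_000_000))[:200])
```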
I really like going directly for archive. It fits really well into the things our protocols do best, aka, sharing the load and getting more powerful when more machines join the network, and it could serve as a good adoption mechanism to engage the community to provide the necessary storage to house it.
I started making a basic tool to build a state proof to test the idea. Nodes have some convenient methods that can be leveraged.
The two of interest are
(WIP)
I'm inclined to go for storing all archival state over storing transaction traces, because the archival state can be used to produce transaction traces, but is also useful beyond that for other historical EVM research.
We have an existing approach to archival state, which would be to do full storage of all historical trie data. This approach is functional, in that it allows for accessing arbitrary historical state roots, but it suffers from the inefficiencies that go with having to do merkle trie navigation to access state. I think that this is our "default" starting point for network design since we know it will work.
I also suspect that there's alternate schemes for doing this that could greatly reduce overall storage requirements and reduce the overhead for individual state lookups, but that's an open area of research that would need some :brain: work.
> I really like going directly for archive. It fits really well into the things our protocols do best, aka, sharing the load and getting more powerful when more machines join the network, and it could serve as a good adoption mechanism to engage the community to provide the necessary storage to house it.
Me too -- I had assumed that full archive state was the goal, and initially approached research in that way. That goal still feels essential.
I'm still speculating about a lot of stuff, but the disk space saved by storing Merkle tries feels overwhelmingly worth the work to rebuild a proof if asked.
I think if we're clever about the way we use the concept of "radius", nodes could be flexible not just about the range of account addresses that it covers, but also the depth of state history that it is willing/capable of serving.
I wonder also if there is a way for nodes with overlapping radius to self organize, so that the history of accounts in their shared radius is also distributed.
Patricia tries can be weird -- but generally two tries with the same number of leaves will be smaller than one trie with twice as many leaves. Which in this context would mean it is easier on disk space to expand a radius across time than it would be to expand a radius to include more addresses.
A summary so far
"Tracer Configurations": The set of 4byteTracer, callTracer
, diffMode
and prestateTracer
.
"Basic Archive Traces" (BATs): For any historical block, a Portal node can respond to:
- debug_traceBlockByNumber with Tracer Configurations
- debug_traceBlockByHash with Tracer Configurations
- debug_traceTransaction with Tracer Configurations
- trace_transaction (aka Parity-style trace)

"Counterfactual Archive Traces" (CATs): For any historical block, a Portal node can respond to:

- debug_traceCall
"Full Node History" (FNH): For any historical block, a Portal node can respond to:
- eth_getBlockTransactionCountByHash
- eth_getBlockTransactionCountByNumber
- eth_getUncleCountByBlockHash
- eth_getUncleCountByBlockNumber
- eth_getBlockByHash
- eth_getBlockByNumber
- eth_getTransactionByHash
- eth_getTransactionByBlockHashAndIndex
- eth_getTransactionByBlockNumberAndIndex
- eth_getTransactionReceipt
- eth_getUncleByBlockHashAndIndex
- eth_getUncleByBlockNumberAndIndex
"Full Node State" (FNS): For any block less than 256 blocks old, a Portal node can respond to:
- eth_getBalance
- eth_getStorageAt
- eth_getTransactionCount
- eth_getCode
- eth_call
- eth_estimateGas
"Archive Node State" (ANS): FNS, but for any historical block.
"Accessed State Proof" (ASP): A proof against the block state root that includes every piece of state that a block accesses during execution.
Currently specified overlay networks and their functionality:
Functionality not yet incorporated into an overlay network:
Transactions do not store post-state hashes. Therefore all tracing solutions require the execution of all preceding transactions using values proven against the block state root.
Basic Archive Traces (BATs) and Counterfactual Archive Traces (CATs) via Archive Node State (ANS).
Architecture: Extend the state overlay to blocks more than 256 blocks old. This provides Full Node State (FNS) back to genesis.
To serve BAT/CAT methods a user sends the request to their localhost node, which then does the following (see the sketch after this list):
a. fetch required state via eth_getBalance / eth_getCode / eth_getStorageAt
b. continue to run the transaction
c. if untouched state is required go to a)
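A minimal sketch of that fetch-as-you-go loop, assuming a local Portal node exposing the standard eth_* state methods; the LazyState/EVM interfaces here are hypothetical:

```python
# Hypothetical sketch of serving a trace by fetching state on demand from
# a local Portal node (ANS). The LazyState and EVM interfaces are illustrative.
import requests

RPC = "http://localhost:8545"

def rpc(method, params):
    payload = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    return requests.post(RPC, json=payload).json()["result"]

class LazyState:
    """State backend that pulls untouched state over the network as the EVM asks for it."""

    def __init__(self, block_tag: str):
        self.block_tag = block_tag  # state is requested at this historical block
        self.cache = {}

    def storage(self, address: str, slot: str) -> str:
        key = (address, slot)
        if key not in self.cache:  # step a: fetch untouched state
            self.cache[key] = rpc("eth_getStorageAt", [address, slot, self.block_tag])
        return self.cache[key]

    def code(self, address: str) -> str:
        return rpc("eth_getCode", [address, self.block_tag])

    def balance(self, address: str) -> str:
        return rpc("eth_getBalance", [address, self.block_tag])

# Usage (pseudo): a local EVM replays the block's transactions against
# LazyState, looping through steps a-c until the trace completes.
# evm = EVM(state=LazyState(hex(15_000_000)))
# trace = evm.trace_transaction(tx_index=100)
```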
Challenges:
Benefits:
Implied storage cost:
Basic Archive Traces (BATs) via Accessed State Proofs (ASPs) that contain only values that are required to execute a single block in isolation.
Architecture: Construct proofs for blocks once they are created. Each proof contains state that the block needed when it was created. Any state not accessed by the block is not included in the proof. Proofs are distributed on a new overlay network and are requested for local tracing.
To serve BAT methods a user sends the request to their localhost node:
Challenges:
Benefits:
Implied storage cost:
If "what happened" / Basic Archive Traces are desired they can be achieved by:
The BATs/CATs via ANS option is the favoured approach.
Current research questions: think about optimisations or clever strategies to make the implementation fast and efficient.
First, this analysis seems solid and I appreciate you writing it out and getting it organized. I believe your analysis is all correct.
The main thing I'll say is that these solutions aren't mutually exclusive and that they can/could be built as layers.
If we build ANS (Archive Node State), then ASP (Accessed State Proofs) is simply a layer on top of ANS. In theory, we can build both. The only limits are how much capacity the network is able to acquire through people running nodes. At the current stage of this project, we only have the luxury of building one and thus, if we focus on one of these, it'll be an ANS solution. After that is live and working, the race is on for people to build layers on top of it that make it faster and easier to use.
Previously a historical state format was introduced via Accessed State Proofs. This post expands on the idea to mitigate some previously identified issues.
This continues to be a research discussion, not intended as a formal spec direction change.
First a note on the potential for a sudden drop in database requirements for a state model predicated on proofs.
Pre-verkle, we use Merkle proofs. The data storage burden is heavy, but we provide all historical state.
Post-verkle, every Merkle proof stored can be replaced in situ with a verkle proof, and nodes suddenly get a 90-95% reduction in storage for the proof part of the database.
Archive Node State (ANS) and Counterfactual Archive Traces (CATs) via ASP (Accessed State Proofs) plus new ASI (Accessed State Index)
"Accessed State Index" (ASI): A mapping of state keys to block numbers that they were historically accessed in. E.g,. storage slot key -> array of block numbers.
This index allows discovery of which proof-holding-peers (ASPs) have the state data for a given slot.
Accessed State Proofs (ASPs) give access to historical state that was used in that block but neglect all other state. This is a problem because you cannot answer arbitrary state questions: eth_call / eth_getStorageAt at that block.
A new index can be introduced that stores when each slot was accessed.
Consider a state query (eth_getStorageAt / eth_call) at an old block height (e.g., block 15_000_000). This index (slot access history) means that arbitrary traces are possible. They require two network requests per novel slot (one to get the slot history, one to get the slot value).
An overlay network distributes the slot access history index.
A slot key (e.g., 0x1122...3344) has a distance from each node ID. If the distance is within a node's radius, the node keeps the index data. Or, as a network message that is passed toward nodes with IDs within the radius of 0x1122...:
curl -s -H "Content-Type: application/json" -d '{"method":"portal_stateSlotAccessPriorTo","params":["0x1122...3344", "1500000"],"id":1,"jsonrpc":"2.0"}' http://localhost:8545
Result (where 0x1485e8 is 1345000):
{
"jsonrpc": "2.0",
"id": 1,
"result": "0x1485e8"
}
Then that response can be used with eth_getStorageAt:
curl -s -H "Content-Type: application/json" -d '{"method":"eth_getStorageAt","params":["0x1122...3344", "1345000"],"id":1,"jsonrpc":"2.0"}' http://localhost:8545
Result (where the slot value was 1 when it was last accessed prior to the block in question, and it has not changed since):
{
"jsonrpc": "2.0",
"id": 1,
"result": "0x0000...0001"
}
A node would be responsible for storing a subset of the Accessed State Index. That is, only some state keys. For each key they record a binary accessed/not-accessed flag for every block. This can be represented in the simple case as an uncompressed bitmap, which for each key would be 17_000_000 / 32 ≈ 530K 32-bit integers ≈ 2MB per key. From this you could look at how many state keys there are on mainnet and calculate the worst-case upper bound for this index.
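A quick back-of-the-envelope check of that bitmap size (the 17M block count is taken from the text; the total key count would need to be measured from mainnet):

```python
# Rough size estimate for an uncompressed per-key access bitmap (ASI).
BLOCKS = 17_000_000            # approximate mainnet block height used above
bits_per_key = BLOCKS          # one accessed/not-accessed bit per block
words_per_key = bits_per_key // 32   # ~531_250 32-bit integers
bytes_per_key = words_per_key * 4    # ~2.1 MB per key, uncompressed
print(words_per_key, bytes_per_key / 1e6, "MB")

# Worst-case index size = bytes_per_key * number_of_state_keys.
# In practice the bitmap is sparse and would compress well.
```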
- debug_trace* - nodes store all the data necessary for specific blocks, so tracing is fast once an appropriate node is found.
- eth_getStorageAt - getting individual arbitrary historical slots is quick: two network requests.
- simple debug_traceCall / eth_call: basic tasks like getting a token balance or ENS resolver value at a specific height are fast.
- complex debug_traceCall / eth_call: tracing counterfactual transactions is slower. Each new piece of state requires two lookups, so the time taken is O(n). This is sequential, so a trace involving 100 state lookups would require 200 network requests.
Suppose a node has 5% of the total database size. Then it will have ~1-in-20 blocks locally.
Deduplication of common leaves: store code by code hash to avoid duplicating unchanged contract code that is read (accessed) in different blocks.
A statistical analysis revealed that state trees were about ~8 levels deep. That is 8 levels, each with 15 siblings. Blocks have multiple state accesses so some of these siblings would lead to leaves, but many would not.
There are likely to be large subtrees that overlap. These would be sets of sibling nodes that are unchanged between two of the blocks a node stores proofs for (not necessarily contiguous blocks, e.g., 15_000_000 and 15_000_020). These sets of siblings and the subtrees they form could be stored efficiently by searching the trees for such patterns.
A simple example: where a set of 15 siblings is unchanged between two stored proofs, replace the 15 nodes with a hash of those nodes and only store them once.
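A minimal sketch of that sibling-set deduplication, assuming proofs are held as lists of trie nodes; the storage layout here is illustrative, not a proposed format:

```python
# Hypothetical content-addressed store for sibling sets shared between ASPs.
# A proof is reduced to a list of references; identical 15-sibling sets
# (and code blobs) are stored once, keyed by their hash.
from hashlib import sha256

class DedupStore:
    def __init__(self):
        self.sets = {}    # digest -> serialized sibling set

    def put(self, sibling_set: bytes) -> bytes:
        digest = sha256(sibling_set).digest()
        self.sets.setdefault(digest, sibling_set)   # store once
        return digest                               # reference kept in each proof

    def get(self, digest: bytes) -> bytes:
        return self.sets[digest]

# Two proofs that contain the same unchanged sibling set only cost one copy:
store = DedupStore()
ref_a = store.put(b"siblings-of-node-x-at-block-15000000")
ref_b = store.put(b"siblings-of-node-x-at-block-15000020")
assert len(store.sets) == 2  # distinct sets stored; identical ones collapse to one
```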
An additional Accessed State Index (ASI) allows a node that has the block state proof for block N to answer state queries for slots accessed in that block - for all blocks moving forward until that slot is accessed again x blocks later. That is, they respond to state queries for blocks N to N + x, coordinated by a distributed index.
This allows fast, ~O(1) access to archive state in the simple case and O(n) access for complex counterfactual state queries.
This post explores the DoS potential of the ASP (Accessed State Proof) model for Archive Node State (ANS).
First, a review of which parts of state the block proofs provide. Proofs can only be constructed if you have access to the full state trie. So they are not suited to being the mechanism for the chain tip, where a new block can touch any part of state.
One design is to leave the state overlay, which provides Full Node State (FNS) for blocks up to 256 deep, as is. A separate Archive State overlay can then be responsible for providing state for blocks that are 256 or more blocks deep. This gives nodes the time to produce and check proofs, and to construct an accumulator sealing valid proofs for rapid verification from then onwards.
---
title: Blocks older than 256 become the domain of the Archive State Overlay
---
flowchart TB
C -- Update state--> B -- after 256 blocks--> A
subgraph A[Archive Node State ANS via Accessed State Proofs ASP]
direction BT
subgraph aACC[Accumulator]
direction TB
asr[State root]
abh[Block hash]
aph[hash of accessed state proof ASP]
end
subgraph aASP[Accessed State Proofs ASPs]
direction TB
Proof
ap1bh[Block hash]
end
subgraph aASI[Accessed state index ASI]
direction TB
aasiacc[When each slot was accessed]
end
end
subgraph B[Full Node State FNS via distributed trie design]
direction BT
accounts
storage
contracts
end
C[Bridge injects new blocks created by EL/CL]
N1[state overlay nodes store particular accounts, keys, contracts]
N2[archive state overlay nodes store proofs for particular blocks]
N3[archive state overlay nodes store appearances for particular state keys]
N4[Proofs are constructed by tracing a block, aggregating values, calling eth_getProof ]
N1 -.- B
N4 -.- A
A -.- N2 & N3
Nodes constructing proofs trace recent blocks. They could use either the State and History overlays to do this, or they could use a bridge connected to a non-Portal node that implements eth_getProof.
Invalid proofs are expensive to create but easy to check. Once an ASP is deemed valid it may still be "bad" with respect to the block it is paired with by containing:
The block must be executed to differentiate the bad-but-valid proofs from the ideal proof: one that only contains the requisite state.
A block header is required to know that the ASP root is correct.
A block header (gas price, block number) and a block body (e.g., list of transactions) are required to execute the block and check that a valid ASP truly contains all the required state and no more.
Thus, on the state overlay network, validating a proof requires access to the history network (they are not completely isolated). This can be addressed through an accumulator of block states (below), but the state data is practically reliant on the history data anyway: a request for a transaction trace involves both the state and history networks. So separation goals are somewhat artificial in this regard.
It would be ideal if the data that is requested of nodes also happens to be colocated on the same nodes. This would minimise network lookups (the most sensitive part of the Portal Network).
For example, the content id of ASPs could be designed to match that of block bodies, so nodes by default have the majority of the data for a single block at hand.
Here is an example system that achieves this goal, where ASP content key is defined by the block hash.
Name | Network | Key | ID | Node responsible | Size |
---|---|---|---|---|---|
Header | History | 00 + block_hash | 0xabc...123 | A "~0xabc" | small |
Body | History | 01 + block_hash | 0xdef...789 | B "~0xdef" | medium |
ASP | Archive State | 01 + block_hash | 0xdef...789 | B "~0xdef" | large |
Any ASP that enters the state network will be directed to node B. That node makes a single request to get the block header. At this point, the ASP proof can be checked and then the block replayed to check that the proof contains the minimum-viable state.
Following the verification that an ASP is valid and ideal (not insufficient/bloated), it could be canonicalized in an accumulator to enable the state network to decouple from the history network. That is, once the state at the chain tip is checked (which requires the history network), the state network could become self-sufficient.
An accumulator containing:
By embedding the hash of the ASP, a new node entering the network has the following capacity: to receive an ASP for an old block, and to check that the ASP is correct, that the proof was used to execute the block, and that it was shown to contain the necessary information. In this way, the node may participate in the State overlay without compromising on guarantees about what data it is storing.
To satisfy the optimisation "nodes that check proofs should already have most data locally", the following content design may be useful.
Main property: define ASPs by block hash, rather than by state root.
accessed_state_proof = TBD # Merkle proof in SSZ format.
block_hash = Bytes32
# Keep the block hash with the proof so that its key may be readily derived
asp_content = Container(
block_hash: block_hash,
proof: accessed_state_proof
)
asp_selector = 0x01 # Matches the selector in the History network.
asp_content_key = asp_selector + block_hash
asp_content_id = keccak(asp_content_key)
If instead the ASP content were defined by the state root, then when a new ASP enters the network it would be directed to a node that does not have block bodies.
All new ASPs for a particular block will be directed to the same nodes, which will check the proofs and, if they are correct and ideal, will extend the state accumulator (defined below). They then send the latest accumulator to the rest of the network.
The accumulator records block header data and a hash of the ASP.
accessed_state_proof_hash = keccak(accessed_state_proof)
# An individual record for a historical state root.
state_root_record = Container(
state_root: Bytes32, # from block header
block_hash: Bytes32, # from block header
accessed_state_proof_hash: accessed_state_proof_hash
)
The accumulator structure and algorithm follow the same protocol as the header accumulator in the History network.
EPOCH_SIZE = 8192 # blocks
MAX_HISTORICAL_EPOCHS = 131072 # 2**17
# The records of the state roots from within a single epoch
state_epoch_accumulator = List[state_root_record, max_length=EPOCH_SIZE]
state_master_accumulator = Container(
historical_epochs: List[bytes32, max_length=MAX_HISTORICAL_EPOCHS],
current_epoch: state_epoch_accumulator,
)
state_epoch_hash = hash_tree_root(state_epoch_accumulator)
state_accumulator_content = SSZ.serialize(state_epoch_accumulator)
state_accumulator_selector = 0x00
state_accumulator_content_key = state_accumulator_selector + state_epoch_hash
state_accumulator_content_id = keccak(state_accumulator_content_key)
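As a sketch of how a node might use this accumulator to check a received ASP for an old block (the Record type mirrors state_root_record above, and the Merkle proof verifier is passed in as a placeholder rather than specified here):

```python
# Hypothetical check of a received ASP against the state accumulator record.
from dataclasses import dataclass
from typing import Callable

from eth_hash.auto import keccak  # keccak256, as used for content ids above


@dataclass
class Record:
    state_root: bytes
    block_hash: bytes
    accessed_state_proof_hash: bytes


def check_received_asp(asp_bytes: bytes,
                       record: Record,
                       verify_proof: Callable[[bytes, bytes], bool]) -> bool:
    # 1. The bytes must be exactly the proof that was canonicalized at the chain tip.
    if keccak(asp_bytes) != record.accessed_state_proof_hash:
        return False
    # 2. The proof must verify against the state root taken from the header.
    return verify_proof(asp_bytes, record.state_root)
```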
With the above design, a malicious node can produce bad proofs at the chain tip. They construct proofs that have valid state, but insufficient/bloated values with respect to the block.
Defense involves replaying that block using the proof data. Then disseminating an accumulator so that in the future this process does not have to be repeated.
An attacker could produce bloated/insufficient proofs at the front of the accumulator. To decide which accumulator value is correct, a node would have to replay the block. A bloated proof can be pruned down to an optimal proof. An insufficient proof can be rejected.
A flood of insufficient proofs represents a spam attack that is cheap to generate and costly to verify. A node will have to execute the block to reject each one. If the time to check all the spam proofs for a given block exceeds the block time (12s), then the node will not keep up with proof production.
Blocks are full at 30 MGas, which is calibrated to limit state growth. EVM engines are capable of much higher speeds, approximately 10-100 blocks/second. Suppose a worst case of 1 block/second: an EVM could then process ~10 blocks within one 12-second slot. Hence, one mitigation is a rule where, if a node is presented with more than 10 proofs for a single block height, it can block the offending peer.
Importantly, executing the block is EVM-speed because the node has the data locally.
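A minimal sketch of that per-height limit; the threshold of 10 comes from the reasoning above, while the limiter interface itself is hypothetical:

```python
# Hypothetical per-peer, per-block-height proof rate limiter.
# If a peer submits more than MAX_PROOFS_PER_HEIGHT proofs for one height,
# it is treated as a spammer and blocked.
from collections import defaultdict

MAX_PROOFS_PER_HEIGHT = 10  # ~ how many blocks an EVM can replay per 12s slot


class ProofRateLimiter:
    def __init__(self):
        # (peer_id, block_height) -> number of proofs received
        self.counts = defaultdict(int)
        self.blocked = set()

    def accept(self, peer_id: str, block_height: int) -> bool:
        if peer_id in self.blocked:
            return False
        self.counts[(peer_id, block_height)] += 1
        if self.counts[(peer_id, block_height)] > MAX_PROOFS_PER_HEIGHT:
            self.blocked.add(peer_id)   # stop verifying this peer's proofs
            return False
        return True                      # worth spending EVM time to verify
```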
Some Accessed State Proofs may touch so much state that they are very large - perhaps >100MB at a guess. This depends on the content of the blocks being produced, which can vary for a myriad of reasons.
Fundamentally, if such a block is to be replayed, this state must be available. If the data model is ASP, then a request for the state of a particular block could invoke a ~100MB transfer.
Alternatively, if the state is distributed (ANS via FNS, without accessed state proofs), then the data will have to be retrieved from many peers, and likely represents a block that is very difficult to trace at all. For example, executing a small part of the block, then fetching state, then more execution and then more fetching, likely hundreds of times sequentially. Each request being for a proof for a single piece of state. In practice, this results in the same order of data transfer (~100MB), but split over a long period.
If it is the intention to be able to trace all blocks, then getting everything required for a block in a single request seems good - and for state-heavy blocks, this could mean the difference between succeeding or not-succeeding with the trace in a timely manner.
Description
The portal network does not provide archive node functionality. Could it?
Motivation
Exploration of new user types for the Portal network.
Currently a portal network node could be of use to someone wanting full-node functionality post-EIP4444 (history expiry). That includes access to historical transactions and receipts. Yet this is an incomplete representation of the transaction.
At present there are popular websites that provide transaction data. Block explorers and individual transaction explorers such as: https://ethtx.info/mainnet/0x1737d3cb7407b4fbf17e152e2d95cabca691e5fdcc8ae67a48c1a11d4809fe78/
What if instead of seeking such websites, a user could have a similar interface that sources data from their portal node? Ultimately, deep transaction introspection of isolated transactions has proven to be a popular use case. It provides a critical function that the community requires: what happened / what went wrong. As this functionality is highly desired, it could be a major reason people run Portal nodes.
Preliminary research
Providing archive access likely involves an inefficient representation of data as compared to a normal archive node. Yet this is what the portal network is designed for: to share such a load among many peers.
https://perama-v.github.io/ethereum/protocol/archive
The portal network already has block header accumulators. So blocks can be packaged along with the state that they require to be traced. They can then be distributed in the network trustlessly, and the transactions inside them can be traced by replaying the block.