frisitano opened this issue 6 months ago

I am running into what appears to be an invalid state root calculation when executing the following transaction.

Transaction trace:

Logs:
How did you obtain the trace? I believe this block is part of the ones @praetoriansentry ran (fetching block traces from Jerigon) and it got proven fine, at least on a given version of zk_evm. Can you share your setup with commit revisions for which it is failing?
I generated the trace using the native tracer I am developing. The zk_evm I am using is from this PR: https://github.com/0xPolygonZero/zk_evm/pull/157.
I just tested against the mentioned Jerigon witness we used previously and the latest zk_evm@develop backend prover, and I didn't encounter any issue with this txn:
```
2024-04-30T16:06:04.309309Z INFO p_gen: evm_arithmetization::generation::state: CPU halted after 177526 cycles id="b19240674 - 144"
2024-04-30T16:06:04.311728Z INFO p_gen: ops: txn proof (f272b5c901812d981a57ee366d0b25cb68835be82fe25068f99f763b5e0bc3ef) took 159.025625ms id="b19240674 - 144"
```
so the issue may be coming from your end. I'll have a look. Would you happen to have the block trace / kernel trace logs?
Kernel trace logs for this transaction:
I'll produce the block trace now.
Yep, sounds like it's something on my end.
It seems you don't have a valid state trie to begin with (although the witness you are passing to the kernel does reconstruct that same trie).
```
Cycle 28699, ctx=0, pc=45084, below hash_initial_tries, instruction=MloadGeneral, stack=[21474836487, 15624320678034521268530137619419123205812357239735782298755735174456886811719, 911]
Cycle 28700, ctx=0, pc=45085, below hash_initial_tries, instruction=BinaryArithmetic(Sub), stack=[15624320678034521268530137619419123205812357239735782298755735174456886811719, 15624320678034521268530137619419123205812357239735782298755735174456886811719, 911]
```
Your state trie root pre-txn execution is 15624320678034521268530137619419123205812357239735782298755735174456886811719.
On the Jerigon side, it is:
```
Cycle 28699, ctx=0, pc=45084, below hash_initial_tries, instruction=MloadGeneral, stack=[21474836487, 95304458244551979935727276103143456771384068114772563680996672004691695516135, 911] id="b19240674 - 144"
Cycle 28700, ctx=0, pc=45085, below hash_initial_tries, instruction=BinaryArithmetic(Sub), stack=[95304458244551979935727276103143456771384068114772563680996672004691695516135, 95304458244551979935727276103143456771384068114772563680996672004691695516135, 911] id="b19240674 - 144"
```
i.e. 95304458244551979935727276103143456771384068114772563680996672004691695516135.
I've attached the Jerigon block payload and txn trace: traces.zip
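As an aside, a quick sanity check when debugging this kind of mismatch is to hash the partial state trie reconstructed from the witness and compare it against the expected pre-state root (the parent block's stateRoot), before running the kernel. This is only a sketch assuming an already-decoded mpt_trie HashedPartialTrie; the helper name is illustrative, not zk_evm code.

```rust
// Sketch only: assumes the `mpt_trie` and `ethereum_types` crates, and that the
// state trie from the witness has already been decoded into a HashedPartialTrie.
use ethereum_types::H256;
use mpt_trie::partial_trie::{HashedPartialTrie, PartialTrie};

/// Hypothetical helper: compare the root of the reconstructed pre-state trie
/// against the parent block's stateRoot before feeding it to the prover.
fn check_pre_state_root(state_trie: &HashedPartialTrie, expected_root: H256) {
    let computed = state_trie.hash();
    assert_eq!(
        computed, expected_root,
        "witness reconstructs an unexpected pre-state root: {computed:x} vs {expected_root:x}"
    );
}
```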
Hmm, that is peculiar. I'll try and investigate what is going on. I've attached the native block witness. It's hard to compare the tx_infos, as it looks like Jerigon includes storage writes in which the value is the same as the original value in the slot, so it is in essence a no-op; the native tracer does not do this.
> Jerigon includes storage writes in which the value is the same as the original value

I may be wrong on this, but I believe the underlying reason is that if we need to perform an SSTORE (regardless of the value), we'll need to access the storage slot and hence need to provide its witness.

> I may be wrong on this, but I believe the underlying reason is that if we need to perform an SSTORE (regardless of the value), we'll need to access the storage slot and hence need to provide its witness.

Yeah, we certainly need to include it in the state witness (which we do in the native tracer), but I don't think we need to include it in the storage_writes of the tx info. I believe in some cases it also includes the nonce even when it doesn't change.
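To make the distinction concrete, here is a purely hypothetical sketch (none of these types exist in zk_evm or the native tracer; names are made up for illustration) of the rule being discussed: every slot touched by an SSTORE must be provable in the witness, but only a value that actually changes needs to appear in the txn's storage writes.

```rust
// Hypothetical tracer-side bookkeeping, for illustration only.
use std::collections::{HashMap, HashSet};

type Slot = [u8; 32];
type Word = [u8; 32];

#[derive(Default)]
struct TxnStorageTrace {
    /// Slots whose pre-state must be provable (drives witness generation).
    accessed_slots: HashSet<Slot>,
    /// Only slots whose value actually changed are recorded as writes.
    storage_writes: HashMap<Slot, Word>,
}

impl TxnStorageTrace {
    fn on_sstore(&mut self, slot: Slot, old_value: Word, new_value: Word) {
        // An SSTORE always touches the slot, so its witness is always needed...
        self.accessed_slots.insert(slot);
        // ...but writing the same value back is a no-op for the post-state and
        // can be omitted from storage_writes (what the native tracer does).
        if new_value != old_value {
            self.storage_writes.insert(slot, new_value);
        }
    }
}
```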
I think there may be a bug in the mpt_trie implementation. I have two partial tries with the same initial root. I perform the same operations on both tries (one insertion and one deletion), and after the operations the two tries yield different roots. It looks like the delete operation is responsible for the divergence. The difference between the two tries is that the native trie has a hash node where the Jerigon trie has a leaf node, but other than that they are the same.
I have attached a zip with the source code used to run this test along with the Jerigon and native tries. You can compile and run with `cat jerigon_storage_tree.json | ./target/release/trie_test` and `cat native_storage_tree.json | ./target/release/trie_test`.
```
❯ cat jerigon_storage_tree.json | ./target/release/trie_test
initial hash: 0xcfc5…dc6b
insert_key: Nibbles { count: 64, packed: "0xe7c11d5270a96d8ff353e9c32fc53375eaeedd3efe2be73798d560e1d8c1f299" }
insert_value: [137, 14, 108, 227, 242, 56, 61, 189, 73, 108]
do insert
hash after insert: 0xb93f…8fa9
delete_key: Nibbles { count: 64, packed: "0xcec0d88d45c06fe2864991abe6cf6a1fbbd064d8166c0c24cd36eeb1016f9282" }
value in slot before delete: Some([136, 10, 0, 98, 2, 82, 207, 242, 133])
hash after delete: 0x26ea…7af6
```

```
❯ cat native_storage_tree.json | ./target/release/trie_test
initial hash: 0xcfc5…dc6b
insert_key: Nibbles { count: 64, packed: "0xe7c11d5270a96d8ff353e9c32fc53375eaeedd3efe2be73798d560e1d8c1f299" }
insert_value: [137, 14, 108, 227, 242, 56, 61, 189, 73, 108]
do insert
hash after insert: 0xb93f…8fa9
delete_key: Nibbles { count: 64, packed: "0xcec0d88d45c06fe2864991abe6cf6a1fbbd064d8166c0c24cd36eeb1016f9282" }
value in slot before delete: Some([136, 10, 0, 98, 2, 82, 207, 242, 133])
hash after delete: 0xd578…7641
```
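For reference, here's roughly what the attached trie_test harness does, sketched against the mpt_trie API (the PartialTrie trait with insert/delete/get/hash, and Nibbles::from_h256_be). This is an approximation of the attached code, not a copy of it; serde support and whether insert/delete return Results vary between revisions.

```rust
// Approximate reconstruction of trie_test; assumes HashedPartialTrie can be
// deserialized from the attached JSON and that insert/delete return Results in
// this mpt_trie revision (older revisions may differ).
use std::io::Read;
use std::str::FromStr;

use ethereum_types::H256;
use mpt_trie::nibbles::Nibbles;
use mpt_trie::partial_trie::{HashedPartialTrie, PartialTrie};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read the serialized partial trie (jerigon_* or native_*) from stdin.
    let mut buf = String::new();
    std::io::stdin().read_to_string(&mut buf)?;
    let mut trie: HashedPartialTrie = serde_json::from_str(&buf)?;
    println!("initial hash: {:?}", trie.hash());

    // Apply the same two operations to whichever trie was provided.
    // (0x prefixes stripped, since H256::from_str expects bare hex.)
    let insert_key = Nibbles::from_h256_be(H256::from_str(
        "e7c11d5270a96d8ff353e9c32fc53375eaeedd3efe2be73798d560e1d8c1f299",
    )?);
    trie.insert(insert_key, vec![137, 14, 108, 227, 242, 56, 61, 189, 73, 108])?;
    println!("hash after insert: {:?}", trie.hash());

    let delete_key = Nibbles::from_h256_be(H256::from_str(
        "cec0d88d45c06fe2864991abe6cf6a1fbbd064d8166c0c24cd36eeb1016f9282",
    )?);
    println!("value in slot before delete: {:?}", trie.get(delete_key));
    trie.delete(delete_key)?;
    println!("hash after delete: {:?}", trie.hash());

    Ok(())
}
```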
@BGluth I haven't followed this topic very closely. Have we started investigating the cause of the mpt_trie issue? Do we know where this is coming from / has it been fixed already?
Yeah sorry, I probably should have posted an update here. This was discussed on Slack with @frisitano, and the cause is the infamous issue of the full node providing this:
```
  B
 / \
L   H
```
when a txn ends up deleting the leaf L and collapsing the branch into an extension node. In this case, an extension node E pointing to a hash node won't allow us to collapse further if the hash node happens to hide a leaf node (since we have no idea what type of node got hashed). Since E pointing to the hashed leaf produces a different hash than the properly collapsed leaf node L', we produce the wrong hash.
So yeah, in short this is an issue with the full node, and I think we're good to close this. As an aside, we really need to document this somewhere, as it is not very obvious yet critical for full node implementors.
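For posterity, the failure mode can be reproduced in isolation along these lines. This is only a sketch: it assumes mpt_trie's create_trie_subset for hashing out the sibling and the PartialTrie insert/delete/hash API, whose exact signatures vary between revisions; on revisions that include the #237 error handling, the second delete should error out instead of silently diverging.

```rust
use ethereum_types::H256;
use mpt_trie::nibbles::Nibbles;
use mpt_trie::partial_trie::{HashedPartialTrie, PartialTrie};
use mpt_trie::trie_subsets::create_trie_subset;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Two leaves diverging at the first nibble, so the root is a branch B with
    // children L and its sibling.
    let l_key = Nibbles::from_h256_be(H256::repeat_byte(0x11));
    let sibling_key = Nibbles::from_h256_be(H256::repeat_byte(0x22));

    let mut full = HashedPartialTrie::default();
    full.insert(l_key, vec![1u8, 2, 3])?;
    full.insert(sibling_key, vec![4u8, 5, 6])?;

    // Witness-style trie: only L's path is needed, so the sibling leaf gets
    // hashed out (this is the `H` node the full node hands us).
    let mut partial = create_trie_subset(&full, [l_key])?;
    assert_eq!(full.hash(), partial.hash());

    // Deleting L collapses B. The full trie can merge B into the sibling leaf
    // L'; the partial trie only sees a hash node and builds an extension over
    // it, so the two roots diverge (the bug discussed above).
    full.delete(l_key)?;
    partial.delete(l_key)?;
    println!("full:    {:?}", full.hash());
    println!("partial: {:?}", partial.hash());

    Ok(())
}
```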
I think you had proposed raising an error in the case in which we do not have sufficient information to collapse the branch node instead of producing an incorrect trie. Has this error logic been implemented?
Ah right, that did come up. I haven't implemented this yet, but I opened an issue (#237).
It may be happening with Jerigon as well. The state trie obtained after application of the deltas of the first txn in the block below is invalid. Testable with the feat/cancun branch.
witness-0014.json