0xPolygonZero / zk_evm

Apache License 2.0

Native RPC issues #343

Open frisitano opened 1 week ago

frisitano commented 1 week ago

We have identified two RPC issues:

1) An "entered unreachable code" panic associated with the following code:

https://github.com/0xPolygonZero/zk_evm/blob/d81d68380e1209030c91508adcb9007c47441fdb/zero_bin/rpc/src/native/txn.rs#L77-L82

It turns out that the response from the RPC provider is deserialized into an unexpected enum variant. The following example is taken from block 20169213 and txn hash 0xa14a8b6a069a35100a292320a3d28df14a69d67cc71677f2098d782150d2a736, using quicknode as the RPC provider. We would expect the enum variant for the pre_trace to be GethTrace::PreStateTracer(PreStateFrame::Default(read)), but instead we receive GethTrace::JS(...), as seen below:

JS(Object {"0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5": Object {"balance": String("0x1062d0cd4fd67ef55"), "nonce": Number(1116079)}, "0xa09b228ad69a2f7bc95e559667e94873d93d1683": Object {"balance": String("0x4fe63a04d9ad497c"), "nonce": Number(5)}, "0xa1e166e2e6908a7ffba440b8f960ddd02148149a": Object {"balance": String("0x1446a63f7219085"), "nonce": Number(21)}, "0xbbc9d47c836659e74a8af1252713b71ed45fd53b": Object {"balance": String("0x16345785d8a0000"), "nonce": Number(1)}, "0xcde694e57f20a27d9404790e86f471943c412caf": Object {"balance": String("0x0"), "nonce": Number(1)}, "0xdd20f7fb1eaa440dc051b90f3370ed1cc77c9e17": Object {"balance": String("0x0"), "nonce": Number(1)}, "0xf5d5193e0dcc8e49c151713c5b456e15cd4e7c5e": Object {"balance": String("-0x1bc16d674ec80000")}})

Looking at the data, account 0xf5d5193e0dcc8e49c151713c5b456e15cd4e7c5e is returned with an invalid, negative balance of -0x1bc16d674ec80000, which I suspect is causing alloy to parse the response as a custom JS trace.
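The suspicion about the malformed balance can be checked independently of alloy. The following is a minimal sketch, assuming the prestateTracer frame has already been decoded into a dict mapping address to account fields. The helper names is_hex_quantity and find_invalid_balances are hypothetical and are not part of zk_evm or alloy. It flags any balance that is not an unsigned, 0x-prefixed hex string:

```python
import re

# An unsigned, 0x-prefixed hex quantity such as "0x0" or "0x16345785d8a0000".
_HEX_QUANTITY = re.compile(r"0x[0-9a-fA-F]+")


def is_hex_quantity(value):
    """Return True if value is a string holding an unsigned hex quantity."""
    return isinstance(value, str) and _HEX_QUANTITY.fullmatch(value) is not None


def find_invalid_balances(frame):
    """Return {address: balance} for every account whose balance is malformed.

    `frame` is assumed to be a decoded prestateTracer frame: a dict mapping
    address -> account fields. Accounts without a "balance" key are skipped.
    """
    return {
        address: account["balance"]
        for address, account in frame.items()
        if "balance" in account and not is_hex_quantity(account["balance"])
    }


# Two accounts copied from the quicknode trace quoted in this issue.
pre_trace = {
    "0xbbc9d47c836659e74a8af1252713b71ed45fd53b": {
        "balance": "0x16345785d8a0000",
        "nonce": 1,
    },
    "0xf5d5193e0dcc8e49c151713c5b456e15cd4e7c5e": {
        "balance": "-0x1bc16d674ec80000",
    },
}

print(find_invalid_balances(pre_trace))
```

Running a check like this over a provider's response before handing it to the witness generator would surface the bad account directly, rather than as an unexpected enum variant further down the pipeline.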

When we look at this transaction on Etherscan, we can see that the balance for this account should be 0 - https://etherscan.io/tx/0xa14a8b6a069a35100a292320a3d28df14a69d67cc71677f2098d782150d2a736#statechange.

Interestingly, this appears to be an RPC-provider-specific issue. Let's compare the responses to the following request from quicknode and alchemy.

Request:

{"method":"debug_traceTransaction","params":["0xa14a8b6a069a35100a292320a3d28df14a69d67cc71677f2098d782150d2a736", {"tracer": "prestateTracer", "tracerConfig": {"diffMode": true}} ], "id":1,"jsonrpc":"2.0"}

quicknode:

{
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "post": {
            "0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5": {
                "balance": "0x1062eb7a16d0f4b55"
            },
            "0xa09b228ad69a2f7bc95e559667e94873d93d1683": {
                "balance": "0x185fb9466227716e",
                "nonce": 7
            },
            "0xa1e166e2e6908a7ffba440b8f960ddd02148149a": {
                "balance": "0x834c5bdcad39085"
            },
            "0xbbc9d47c836659e74a8af1252713b71ed45fd53b": {
                "balance": "0x853a0d2313c0000"
            },
            "0xcde694e57f20a27d9404790e86f471943c412caf": {
                "balance": "0x6f05b59d3b20000"
            },
            "0xdd20f7fb1eaa440dc051b90f3370ed1cc77c9e17": {
                "balance": "0x6f05b59d3b20000"
            },
            "0xf5d5193e0dcc8e49c151713c5b456e15cd4e7c5e": {
                "balance": "0x0",
                "nonce": 1
            }
        },
        "pre": {
            "0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5": {
                "balance": "0x1062d0cd4fd67ef55",
                "nonce": 1116079
            },
            "0xa09b228ad69a2f7bc95e559667e94873d93d1683": {
                "balance": "0x4fe63a04d9ad497c",
                "nonce": 5
            },
            "0xa1e166e2e6908a7ffba440b8f960ddd02148149a": {
                "balance": "0x1446a63f7219085",
                "nonce": 21
            },
            "0xbbc9d47c836659e74a8af1252713b71ed45fd53b": {
                "balance": "0x16345785d8a0000",
                "nonce": 1
            },
            "0xcde694e57f20a27d9404790e86f471943c412caf": {
                "balance": "0x0",
                "nonce": 1
            },
            "0xdd20f7fb1eaa440dc051b90f3370ed1cc77c9e17": {
                "balance": "0x0",
                "nonce": 1
            },
            "0xf5d5193e0dcc8e49c151713c5b456e15cd4e7c5e": {
                "balance": "-0x1bc16d674ec80000"
            }
        }
    }
}

alchemy:

{
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "post": {
            "0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5": {
                "balance": "0x1062eb7a16d0f4b55"
            },
            "0xa09b228ad69a2f7bc95e559667e94873d93d1683": {
                "balance": "0x185fb9466227716e",
                "nonce": 7
            },
            "0xa1e166e2e6908a7ffba440b8f960ddd02148149a": {
                "balance": "0x834c5bdcad39085"
            },
            "0xbbc9d47c836659e74a8af1252713b71ed45fd53b": {
                "balance": "0x853a0d2313c0000"
            },
            "0xcde694e57f20a27d9404790e86f471943c412caf": {
                "balance": "0x6f05b59d3b20000"
            },
            "0xdd20f7fb1eaa440dc051b90f3370ed1cc77c9e17": {
                "balance": "0x6f05b59d3b20000"
            },
            "0xf5d5193e0dcc8e49c151713c5b456e15cd4e7c5e": {
                "nonce": 1
            }
        },
        "pre": {
            "0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5": {
                "balance": "0x1062d0cd4fd67ef55",
                "nonce": 1116079
            },
            "0xa09b228ad69a2f7bc95e559667e94873d93d1683": {
                "balance": "0x3424cc9d8ae5497c",
                "nonce": 6
            },
            "0xa1e166e2e6908a7ffba440b8f960ddd02148149a": {
                "balance": "0x1446a63f7219085",
                "nonce": 21
            },
            "0xbbc9d47c836659e74a8af1252713b71ed45fd53b": {
                "balance": "0x16345785d8a0000",
                "nonce": 1
            },
            "0xcde694e57f20a27d9404790e86f471943c412caf": {
                "balance": "0x0",
                "nonce": 1
            },
            "0xdd20f7fb1eaa440dc051b90f3370ed1cc77c9e17": {
                "balance": "0x0",
                "nonce": 1
            }
        }
    }
}

Running this block against the alchemy RPC results in successful witness generation, so I conclude that this is an RPC-provider-specific issue.
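To make the differences between the two responses easier to spot, the "pre" maps can be compared account by account. This is a rough sketch only: diff_prestate is a hypothetical helper, and it assumes both frames are already decoded into dicts keyed by identically formatted (lowercase) addresses. The sample data is a subset of the two "pre" objects quoted above:

```python
def diff_prestate(left, right):
    """Return {address: (left_account, right_account)} where the two differ.

    An account missing on one side is reported as None for that side.
    """
    differences = {}
    for address in sorted(set(left) | set(right)):
        a, b = left.get(address), right.get(address)
        if a != b:
            differences[address] = (a, b)
    return differences


# Subset of the "pre" state returned by each provider in this issue.
quicknode_pre = {
    "0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5": {
        "balance": "0x1062d0cd4fd67ef55",
        "nonce": 1116079,
    },
    "0xa09b228ad69a2f7bc95e559667e94873d93d1683": {
        "balance": "0x4fe63a04d9ad497c",
        "nonce": 5,
    },
    "0xf5d5193e0dcc8e49c151713c5b456e15cd4e7c5e": {
        "balance": "-0x1bc16d674ec80000",
    },
}
alchemy_pre = {
    "0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5": {
        "balance": "0x1062d0cd4fd67ef55",
        "nonce": 1116079,
    },
    "0xa09b228ad69a2f7bc95e559667e94873d93d1683": {
        "balance": "0x3424cc9d8ae5497c",
        "nonce": 6,
    },
}

for address, (qn, al) in diff_prestate(quicknode_pre, alchemy_pre).items():
    print(address, "quicknode:", qn, "alchemy:", al)
```

On this subset the comparison reports two accounts: 0xa09b228ad69a2f7bc95e559667e94873d93d1683, whose pre-state balance and nonce differ between providers, and 0xf5d5193e0dcc8e49c151713c5b456e15cd4e7c5e, which is absent from the alchemy "pre" state.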

2) Failed to get proof for account when requesting account state witness data

It appears that when making requests for state witness data, in some instances the rpc provider returns an error. If we take the example of block 20169553 and account 0xb8901acB165ed027E32754E0FFe830802919727f with associated keys as seen in the request below:

{"jsonrpc":"2.0","method":"eth_getProof","params":["0xb8901acB165ed027E32754E0FFe830802919727f",["0xa1e0ef2240f3b01c0b2d4eaca4ff152866d393bb1c6df4405ca50efde3b8fb16", "0xf4c6c4ef40ee5c6ee938819600dc3e836fff624f4b84db4d05d1b21b9b89f962", "0x13cf322f7e0bfa675def40e707c44e02b90d3336a6f94a99bef4e39ad34ef10d", "0x812c8fc731d689ed5ded39e7869bf6693f18a74a2f3cb1feb1b6e33f7014843a", "0x17cbec8ebc0653de63a5338c3cc4f7ca2a7c16633ad26315800047cdf65858e7", "0x12b90a824975a1a225860b45a23817ce04f04b82ed957fe4bb6adaa44e4689ab", "0xc84359bb210f06b78e364d920ca734de6bcbc2de93362ff2e8da378632ef0f35", "0x628489fccd64b5cd0bd725fcaff542b0d6733023cff77c7a7cb2a03e7f771fdf", "0x50cd427763160a6da5b63b9387460598205067628d075d83ff63ed9ba98a482a", "0x0000000000000000000000000000000000000000000000000000000000000000", "0xd9212918c2cbd49caf7ce3c1c42c493e0e84c6971d6dd2369fbdfb2e4421af22", "0xf19d4abf7fec7322687462b4a6e2e46a0ee39524e53896980b8d3454d776762d", "0x000000000000000000000000000000000000000000000000000000000000000e"],"0x133c351"],"id":1}

The rpc response from quicknode is:

{
    "jsonrpc": "2.0",
    "error": {
        "code": -32602,
        "message": "Value cannot be null. (Parameter 'key')",
        "data": "System.ArgumentNullException: Value cannot be null. (Parameter 'key')\n   at System.Collections.Generic.Dictionary`2.FindValue(TKey key)\n   at Nethermind.State.Proofs.AccountProofCollector.ShouldVisit(Hash256 nextNode)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, 
TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.TrieNode.Accept[TNodeContext](ITreeVisitor`1 visitor, TNodeContext& nodeContext, ITrieNodeResolver nodeResolver, TreePath& path, TrieVisitContext trieVisitContext)\n   at Nethermind.Trie.PatriciaTree.Accept[TNodeContext](ITreeVisitor`1 visitor, Hash256 rootHash, VisitingOptions visitingOptions, Hash256 storageAddr, Hash256 storageRoot)\n   at Nethermind.JsonRpc.Modules.Eth.EthRpcModule.eth_getProof(Address accountAddress, UInt256[] storageKeys, BlockParameter blockParameter)\n   at System.Reflection.MethodInvoker.InvokeImpl(Object obj, Object arg1, Object arg2, Object arg3, Object arg4)\n   at System.Reflection.MethodInvoker.Invoke(Object obj, Span`1 arguments)\n   at Nethermind.JsonRpc.JsonRpcService.ExecuteAsync(JsonRpcRequest request, String methodName, ResolvedMethodInfo method, JsonRpcContext context)"
    },
    "id": 1
}
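For reproducing this outside of zk_evm, the request and the error handling can be sketched as below. This is an illustration under stated assumptions, not the project's client code: RpcError, build_get_proof_request and unwrap_result are hypothetical names, no network call is made, and only one of the storage keys from the request above is included for brevity.

```python
import json


class RpcError(Exception):
    """Raised when a JSON-RPC response carries an "error" member."""

    def __init__(self, code, message):
        super().__init__(f"JSON-RPC error {code}: {message}")
        self.code = code
        self.message = message


def build_get_proof_request(address, storage_keys, block_number, request_id=1):
    """Build an eth_getProof payload; the block number is hex-encoded."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getProof",
        "params": [address, list(storage_keys), hex(block_number)],
        "id": request_id,
    }


def unwrap_result(response):
    """Return response["result"], or raise RpcError if the call failed."""
    error = response.get("error")
    if error is not None:
        raise RpcError(error.get("code"), error.get("message"))
    return response["result"]


# Block 20169553 encodes to the "0x133c351" seen in the request above.
request = build_get_proof_request(
    "0xb8901acB165ed027E32754E0FFe830802919727f",
    ["0x000000000000000000000000000000000000000000000000000000000000000e"],
    20169553,
)
print(json.dumps(request))

# Shape of the quicknode error quoted above, with the stack trace omitted.
error_response = {
    "jsonrpc": "2.0",
    "error": {
        "code": -32602,  # "Invalid params" in the JSON-RPC 2.0 specification
        "message": "Value cannot be null. (Parameter 'key')",
    },
    "id": 1,
}

try:
    unwrap_result(error_response)
except RpcError as exc:
    print("provider failed:", exc)
```

Surfacing the error code and message separately makes it possible to tell a provider-side failure like this one apart from a proof that was returned but could not be decoded.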

Using alchemy as the RPC provider does in fact yield a response. However, that response also leads to an error associated with the storage proof, specifically "trusted rlp should be valid: RlpExpectedToBeData". I have investigated this problem further and will be opening a related issue to address it.

Conclusion

Both of these issues appear to be associated with the reliability of the data returned by RPC providers. We should find a way to improve the reliability of RPC data. My suggestion would be to host a dedicated archive node.

frisitano commented 6 days ago

I have had a response from alchemy suggesting that they have fixed their rpc service. It would be good to run a test to see how it performs now.

Nashtare commented 6 days ago

That's great! I'll kick off some tests today then.

Nashtare commented 5 days ago

@frisitano It seems that alchemy is still returning erroneous data, or the native tracer may be having some issues. I got a few seemingly erroneous witnesses, which failed to generate proofs at some point, while their jerigon counterparts worked fine. Note that these are cancun blocks (you can test them against the feat/cancun branch).

Failed blocks with alchemy:

The first and last got an mpt_read_hash_node error, so they possibly come from native tracer edge cases.

frisitano commented 1 day ago

I've taken a look into this issue. I have the following suspicions:

In conclusion, I think these errors are associated with native tracer edge cases rather than RPC provider issues.