paradigmxyz / reth

Modular, contributor-friendly and blazing-fast implementation of the Ethereum protocol, in Rust
https://reth.rs/
Apache License 2.0
3.78k stars 1.02k forks source link

trace compatibility testing with Trueblocks #3721

Open wakamex opened 1 year ago

wakamex commented 1 year ago

Describe the bug

trying to create a local trueblocks unchained index, I'm getting some warnings:

Unknown trace type selfdestruct
Unknown trace type selfdestruct
Unknown trace type selfdestructng 391  of 2000 at block 11182389 of 17659147 (6476758 blocks from head)
Unknown trace type selfdestruct
Unknown trace type selfdestruct

I'm not clear how impactful this particular warning is to Trueblocks operation, but it points to something non-standard in Reth's trace implementation.

Digging into this further, I couldn't find specifically where in the trueblocks or erigon codebases they deal with selfdestruct in a way that could explain this discrepancy.

However, Trueblocks has a long list of tests for expected tracing functionality in /src/other/trace_tests/curl_tests. Running these on my reth node, I get a difference from expected for 26 out of 35 tests. You can find the diffs of those tests in this PR: https://github.com/TrueBlocks/trueblocks-core/pull/3037/files

@tjayrush could explain these further.

Steps to reproduce

reproduce warnings above:

  1. sync erigon
  2. install trueblocks
  3. run chifra scrape

get comparison vs. expected Trueblocks trace tests

  1. run_all in /src/other/trace_tests/curl_tests of trueblocks repo (modify line 3 as needed: export RPC_ENDPOINT="http://localhost:23456")
  2. compare_expected_produced.sh

Node logs

No response

Platform(s)

Linux (x86)

What version/commit are you on?

reth Version: 0.1.0-alpha.2 Commit SHA: 1330fc11 Build Timestamp: 2023-07-08T22:15:01.820025074Z Build Features:
Build Profile: release

What database version are you on?

Current database version: 1 Local database is uninitialized

If you've built Reth from source, provide the full command you used

cargo build --release

Code of Conduct

mattsse commented 1 year ago

this is very helpful, thanks a lot

looking into this

tjayrush commented 1 year ago

Thanks @wakamex. @mattsse, please makes sure to ping me either in discord or here if you have any questions about any of these tests. We wrote these tests when we were porting from OpenEthereum to Erigon and as far as I know they worked about two and a half years ago with Erigon.

Things may have changed under the covers with Erigon (although I hope not). So, if a test does not seem right, don't assume there's something wrong with Reth. Ping me and I can help you figure it out.

I'm going to install Reth this weekend, so should have a locally running copy soon to help test this.

mattsse commented 1 year ago

looked at the unknown trace type error, and turns out the OE traces never migrated from suicide to selfdestruct

https://github.com/TrueBlocks/trueblocks-core/blob/7c65387d5d5aa97af7ccf6269281f200711a74ca/src/apps/chifra/internal/scrape/handle_blaze_blaze.go#L294-L294

so in order to be compatible I think we should rename this to suicide as well

mattsse commented 1 year ago

@tjayrush @wakamex

is the insufficient funds error erigon, and the success response reth?

https://github.com/TrueBlocks/trueblocks-core/pull/3037/files#diff-b294536ca537aa07d86d1b87ba6bf94552732b947fb6ecf1fdf06c5d8279557aR7-R17

wakamex commented 1 year ago

is the insufficient funds error erigon, and the success response reth?

yup, you can see the reth output for each test in /src/other/trace_tests/curl_tests/reth

for that test in src/other/trace_tests/curl_tests/reth/trace_callMany_all.json we have: https://github.com/TrueBlocks/trueblocks-core/pull/3037/files?file-filters%5B%5D=.json&show-viewed-files=true#diff-43fbbd2af40737486871fc9b12f73ae842426b16a37678e8b5ece40be1be9f78R1-R8

Perhaps more useful would be running these tests again today against an erigon node, as that seems to be a better source of truth than OE?

tjayrush commented 1 year ago

We should definitely run the tests again against Erigon as these 'expected' results are about two and a half years old. While it would be surprising if they changed things, it's very possible that they did.

tjayrush commented 1 year ago

Just a quick note. These tests are purposefully using curl (as opposed to say, TrueBlocks) so as to eliminate any weird processing in TrueBlocks. The curl responses from both nodes should be identical (at least that's what I think).

mattsse commented 1 year ago

I think the insufficient fund error can be explained by:

https://github.com/ledgerwatch/erigon/blob/af504f675a84868b941fbad840a3d2819fb525bf/turbo/jsonrpc/trace_adhoc.go#L967-L967

which is different from eth_call for example:

https://github.com/ledgerwatch/erigon/blob/af504f675a84868b941fbad840a3d2819fb525bf/turbo/transactions/call.go#L86-L86

note the disabled basefee, reth traceCall currently behaves as eth_call: basefee disabled, gas price 0

which differs from erigon

mattsse commented 1 year ago

it's still suicide

https://github.com/ledgerwatch/erigon/blob/af504f675a84868b941fbad840a3d2819fb525bf/turbo/jsonrpc/trace_adhoc.go#L40-L40

so I changed selfdestruct to suicide here for compact https://github.com/paradigmxyz/reth/pull/3736

tjayrush commented 1 year ago

We should probably check against a current version of Erigon. These 'expected' results are two and a half years old. @wakamex Do you have an Erigon node?

gakonst commented 1 year ago

@tjayrush reaching out in DMs with an RPC.

wakamex commented 1 year ago

I got my hands on an erigon node. Here are the reth vs. erigon results:

Comparing reth_2023_07_11/ to erigon_2023_07_12/, placing results into reth_erigon/
Command executed on Wed Jul 12 19:36:46 UTC 2023
=== Not found ===

=== Identical ===
erigon_2023_07_12/trace_get_0.json
erigon_2023_07_12/trace_get_3.json
erigon_2023_07_12/trace_replayBlockTransactions_trace.json
erigon_2023_07_12/trace_replayTransaction_all.json
erigon_2023_07_12/trace_replayTransaction_trace.json
=== Difference found ===
erigon_2023_07_12/trace_block.json
erigon_2023_07_12/trace_callMany_all.json
erigon_2023_07_12/trace_callMany_none.json
erigon_2023_07_12/trace_callMany_stateDiff.json
erigon_2023_07_12/trace_callMany_trace.json
erigon_2023_07_12/trace_callMany_vmTrace.json
erigon_2023_07_12/trace_call_all.json
erigon_2023_07_12/trace_call_error.json
erigon_2023_07_12/trace_call_none.json
erigon_2023_07_12/trace_call_stateDiff.json
erigon_2023_07_12/trace_call_trace.json
erigon_2023_07_12/trace_call_view.json
erigon_2023_07_12/trace_call_view_none.json
erigon_2023_07_12/trace_call_vmTrace.json
erigon_2023_07_12/trace_filter.json
erigon_2023_07_12/trace_get_10.json
erigon_2023_07_12/trace_get_5.json
erigon_2023_07_12/trace_rawTransaction_none.json
erigon_2023_07_12/trace_rawTransaction_trace.json
erigon_2023_07_12/trace_rawTransaction_vmTrace.json
erigon_2023_07_12/trace_replayBlockTransactions_all.json
erigon_2023_07_12/trace_replayBlockTransactions_none.json
erigon_2023_07_12/trace_replayBlockTransactions_stateDiff.json
erigon_2023_07_12/trace_replayBlockTransactions_vmTrace.json
erigon_2023_07_12/trace_replayTransaction_none.json
erigon_2023_07_12/trace_replayTransaction_stateDiff.json
erigon_2023_07_12/trace_replayTransaction_vmTrace.json
erigon_2023_07_12/trace_transaction.json

You can find the diffs here: https://github.com/wakamex/trueblocks-core/tree/compare_script/src/other/trace_tests/curl_tests/reth_erigon

See this PR for the scripts I used, or run it yourselves: https://github.com/TrueBlocks/trueblocks-core/pull/3037

tjayrush commented 1 year ago

I thought I'd add some color to some of these test cases. This is from memory, so take it with a grain of salt, but...


These failed tests return null instead of [] for traceAddress. This should be simple to fix.


We thought that someone had hand-coded assembly language for this transaction and had made a mistake, so reporting an error is correct. The error, I think, is correct if you look at the actual data. Not sure why it's saying something about the nonce.


Be a little bit careful with this one. I'm not sure this is the exact transaction, but I remember that OpenEthereum may have had a bug in certain places where the first trace in some transactions was incorrectly reported. Erigon found may have hit on this same issue (which is why this test is included). If I remember right, Erigon ended up adding a "compatibility" mode for some cases where they would act correctly by default but allow the user to cause old OpenEthereum bugs to persist. I don't know if this is such a case, but be careful here.


This may have been one of those backward compatibility things. The spec may say it requires a u64, but most of the other RPC endpoints require hex. So, as a compromise, even thought it's not in the spec, this was allowed. The spec is not really a spec after all.


Personally, I think better error messages are better, but this may break existing code. It won't break TrueBlocks, but it may break others.

mattsse commented 1 year ago

tysm for these

for trace_rawTransaction, I think the nonce error occurs because the transaction's nonce is lower that what's in latest state. so this tx would be invalid at this state

gakonst commented 1 year ago

We'll get back to these - thank you TJ, we're resyncing that node for now so it won't be available if you're trying to do more things.

wakamex commented 1 year ago

updated the baseline expectation in favor of modern Erigon in these benign cases: changed gas value, remove of null transactionHash and transactionPosition, improved error message (for more info see https://github.com/wakamex/trace_tests/pull/2)

this leaves only 2 categories of differences between this baseline and modern Erigon, which require further investigation:

result of test_trace_get.sh (Array is the second parameter as described here, table displays result.traceAddress field):

Array reth erigon
0 [0] [0]
1 [1] [1]
2 [2] [2]
3 [3] [3]
4 [4] [4]
5 [5] [4,0]
6 [6] [4,1]
7 null [4,2]
8 null [5]
9 null [5,0]
10 null [5,1]
11 null [5,2]
12 null [6]
13 null [6,0]
14 null [6,1]
15 null [6,2]
16 null null
tjayrush commented 1 year ago

Just one quick comment. The traceAddress example above is super interesting and important for getting the correct calling structure of the transaction. Basically, the array is a call tree. It's almost certainly zero-based, but it may be one-based. I can't remember. [6,2] is the sixth call from the second call in the top level (or seventh and third if zero-based, but you get the idea.)

I'm 99% sure the very top level (trace) is null. The top level trace basically is the transaction itself with slight (very confusing) differences in the gas calc.

mattsse commented 1 year ago

thanks for the example I was able to track this down,

previously we matched the indices to the trace address which explains the null values.

should be fixed in #3852

gakonst commented 1 year ago

What else do we need to close this? I spoke with Peter and he said 50M gas for tracing was a heuristic, so maybe we should just set it to that, and call it a day?

tjayrush commented 1 year ago

Are the tests passing? The tests represent previous behaviour.

yabirgb commented 1 year ago

@gakonst any update on this? Was the limit increased?

mattsse commented 1 year ago

yes, we also have some open tracing fixes that will land shortly, I will check compatibility again

tjayrush commented 1 year ago

I'm in the process of syncing my node. Once it's synced, the first thing I'll do is double-check these issues and try to make them more consumable (by creating a separate, much simpler, example for each outstanding issue).

mattsse commented 1 year ago

tysm for this,

fyi we're aiming for a new alpha release within the next few days

onbjerg commented 5 months ago

@wakamex @tjayrush Hi, I realize this issue is pretty old, but I am wondering if this is still an issue for you? We've had quite a few tracing fixes since the versions mentioned in here

erwin-wee commented 2 months ago

tysm for these

for trace_rawTransaction, I think the nonce error occurs because the transaction's nonce is lower that what's in latest state. so this tx would be invalid at this state

Hi @mattsse, is the node supposed to validate nonce against latest state for a trace of a transaction that's already happened in the past? As an example, for the following trace_transaction call on Ethereum Mainnet, Reth responds with "nonce too low" error, while the request goes through on other clients (e.g. Erigon).

{
    "jsonrpc": "2.0",
    "method": "trace_transaction",
    "params": [
        "0xbfbc675c06af4ed5ebdf35935fb506a41b0c4e1d382db39bbe1fa2ba528bdbb4"
    ],
    "id": 58938095
}