status-im / nimbus-eth1

Nimbus: an Ethereum Execution Client for Resource-Restricted Devices
https://status-im.github.io/nimbus-eth1/
Apache License 2.0

Develop a block validation debugging tool #171

Closed zah closed 5 years ago

zah commented 6 years ago

Introduction

Nimbus needs to be able to execute the entire Ethereum history and arrive at the same result as any other Ethereum client. The process of downloading blocks and validating them through execution is also known as "full blockchain sync". The entire block validation process is specified in detail in the Ethereum Yellow Paper and also in the Ethereum Beige Paper, which offers a more accessible description.

Currently, running nimbus without any arguments will start a full sync that will eventually fail with a validation error around block 48000 (you may need to run nimbus several times before reaching this block).

Each block consists of multiple transactions, and each executed transaction is associated with a receipt object that includes the Merkle root hash of the entire Ethereum state trie. When Nimbus and another client agree on a particular root hash of the state trie after some transaction T, we have high confidence that Nimbus has executed all transactions up to this point correctly. Thus, the first transaction where Nimbus and another client disagree about the state root hash is likely to contain an EVM instruction that is still buggy.
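The comparison described above can be sketched as follows. This is a minimal illustration, not Nimbus code: the function name `first_divergent_tx` and the input shape (one hex-encoded state root per receipt, in execution order) are assumptions made for the example.

```python
from typing import Optional, Sequence


def first_divergent_tx(ours: Sequence[str], theirs: Sequence[str]) -> Optional[int]:
    """Return the index of the first transaction whose post-state root differs.

    Both arguments are hypothetical lists of hex-encoded state roots, one per
    executed transaction, in execution order. Returns None if all roots match.
    """
    for index, (our_root, their_root) in enumerate(zip(ours, theirs)):
        # Hex strings may differ in letter case between clients.
        if our_root.lower() != their_root.lower():
            return index
    if len(ours) != len(theirs):
        # One client produced more receipts than the other; the first
        # unmatched position is treated as the point of divergence.
        return min(len(ours), len(theirs))
    return None
```

The transaction at the returned index is the one worth tracing in detail, since every earlier transaction produced a matching state root.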

The goal of this issue is to develop a tool that will be able to turn any encountered validation error into a simple reproducible test case that focuses on the short execution trace of the instructions appearing in a single problematic transaction. To achieve this we need the following:

Goals

Notes: In Nimbus, block validation starts with the persistBlocks proc defined in https://github.com/status-im/nimbus/blob/master/nimbus/p2p/chain.nim#L108

Notes: Look into the existing implementation of the debug_traceTransaction call available in Nimbus and Geth. If necessary, augment it further to include additional details such as the values of all accessed state nodes.

Notes: Automating the other client through JSON-RPC seems like a good way to achieve this. Our Nim JSON-RPC library can be used for this. (TODO: provide example).

Notes: The popular Remix debugging environment for Ethereum may serve as an inspiration for features that might help the developer pinpoint the problematic instruction and the relevant details about it.

Here is an example for examining the execution of a particular transaction: http://etherscan.io/remix?txhash=0x4fa57777c49c3303181546a08dd59626fcbc25a434f40d74ad32c35aeca2e46c

On Etherscan, Remix is launched by clicking on the "tools & utilities" button available in the upper-right corner of the transaction details page. Remix is also available as a reusable open-source JavaScript library that can be integrated in other projects.
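As a rough illustration of automating another client through JSON-RPC, the sketch below builds the request body for a debug_traceTransaction call. It is written in Python rather than with the Nim JSON-RPC library, the helper name `trace_request` is hypothetical, and the empty options object is only a placeholder. The body would be POSTed to the other client's HTTP RPC endpoint, whose address depends on how that client was started.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

# JSON-RPC 2.0 requires an id so that responses can be matched to requests.
_request_ids = count(1)


def trace_request(tx_hash: str, options: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 request for debug_traceTransaction.

    `options` is the optional tracer configuration object; it is left empty
    here because the useful settings depend on the client being queried.
    """
    payload = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "debug_traceTransaction",
        "params": [tx_hash, options or {}],
    }
    return json.dumps(payload)


# Example: the transaction from the Remix link above.
body = trace_request(
    "0x4fa57777c49c3303181546a08dd59626fcbc25a434f40d74ad32c35aeca2e46c"
)
```

Sending the same request to Nimbus and to the reference client yields two traces that can then be compared step by step.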

status-open-bounty commented 6 years ago

Balance: 0.0 ETH
Contract address: 0x0b870a99c915bab5b1378d1a4f962772d627c6b5
Network: Mainnet
Status: Pending maintainer confirmation
Winner: jangko
Visit https://openbounty.status.im to learn more.

tersec commented 6 years ago

Geth already has adequate tracing via debug.traceTransaction (presumably available via RPC as well, but I've not tested that).

https://stackoverflow.com/questions/46811130/cannot-debug-tracetransaction-in-geth-missing-trie-node
https://github.com/ethereum/go-ethereum/wiki/Management-APIs#debug_tracetransaction
https://ethereum.stackexchange.com/questions/9434/calling-debug-tracetransaction-from-web3-api/9437

It helps to run Geth via geth --syncmode full --gcmode archive.

etc.

yglukhov commented 6 years ago

No need for the bounty, we'll do it ourselves. Here's the basic tracer impl: https://github.com/status-im/nimbus/pull/173. Now the storage trie enumeration (@zah) and rpc_traceTransaction (@coffeepots) are needed?

zah commented 6 years ago

I think our team can focus on ETH 2.0 now and this can be left as a bounty. @jangko expressed interest to work on it.

arnetheduck commented 6 years ago

how does this tool compare with https://github.com/ethereum/evmlab?

zah commented 6 years ago

@arnetheduck, I don't see much overlap between evmlab and the very specific debugging tool discussed here. If you believe there is overlap, please elaborate with some specific and concrete ways it can be used to achieve the same.

arnetheduck commented 6 years ago

The way the tool was explained in the dev call was that you emit a standardized JSON as you execute stuff - this gives you a complete trace of the block execution - in evmlab, this is used to verify fuzzing results and other stuff apparently.

thus it seems we could get some benefits if we followed the same standardized execution log format, ie compatibility with fuzzers etc.

tersec commented 6 years ago

Implementation-wise, how standardized is this in actual Ethereum clients which otherwise read the blockchain?

Geth effectively has a (quite nonstandard) output format, to which I point above, vaguely JSON-y (but not actually JSON).

arnetheduck commented 6 years ago

not sure actually - just highlighted the tool because it was mentioned during eth devs call, as a way to reproduce consensus issues found by fuzzing - seems like this issue talks about pretty much the same thing, ie record execution so as to provide a replay/test.

zah commented 6 years ago

There is indeed a standardized trace format and implementing it is already part of the specification above (see the links to debug_traceTransaction). It's also required for gaining compatibility with any of the existing EVM debuggers.
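To make the value of a shared trace format concrete, here is a minimal sketch of locating the first step at which two traces disagree. It assumes each trace is the list of per-step entries returned under `structLogs` by debug_traceTransaction, with the field names used by Geth's struct logger (`pc`, `op`, `gas`, `gasCost`, `depth`, `stack`). The function name and the choice of compared fields are illustrative assumptions.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

# Per-step fields to compare; memory and storage are omitted here because
# they are optional in the trace output and can be large.
COMPARED_FIELDS = ("pc", "op", "gas", "gasCost", "depth", "stack")

Step = Dict[str, Any]


def first_divergent_step(
    ours: Sequence[Step], theirs: Sequence[Step]
) -> Optional[Tuple[int, str]]:
    """Return (step index, field name) of the first mismatch, or None."""
    for index, (our_step, their_step) in enumerate(zip(ours, theirs)):
        for field in COMPARED_FIELDS:
            if our_step.get(field) != their_step.get(field):
                return index, field
    if len(ours) != len(theirs):
        # One trace ended early, e.g. an out-of-gas in only one client.
        return min(len(ours), len(theirs)), "length"
    return None


# Two tiny hand-written traces that disagree on the cost of the second step.
ours = [
    {"pc": 0, "op": "PUSH1", "gas": 100, "gasCost": 3, "depth": 1, "stack": []},
    {"pc": 2, "op": "SLOAD", "gas": 97, "gasCost": 50, "depth": 1, "stack": ["0x0"]},
]
theirs = [
    {"pc": 0, "op": "PUSH1", "gas": 100, "gasCost": 3, "depth": 1, "stack": []},
    {"pc": 2, "op": "SLOAD", "gas": 97, "gasCost": 200, "depth": 1, "stack": ["0x0"]},
]
mismatch = first_divergent_step(ours, theirs)
```

The opcode at the reported step, together with the stack contents just before it, is a natural starting point for a small reproducible test case.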