Support `ethereum-spec-evm statetest`

SamWilsn commented 6 months ago

This pull request adds support for EELS' ethereum-spec-evm.

Requires an extremely new checkout (at least https://github.com/ethereum/execution-specs/commit/c4f5b8a3ea0b92fcf38f1201255c8919a047f897).

namiloh commented 6 months ago

Oh wow that's awesome!

Can you add it to the docker/Dockerfile too?

SamWilsn commented 6 months ago

I don't know docker that well :sweat_smile:

Our installation is basically:

pip install git+https://github.com/ethereum/execution-specs

namiloh commented 6 months ago

Alright, I'll fix, no worries!

SamWilsn commented 6 months ago

I'll let you know when statetest is available on our master branch!

namiloh commented 6 months ago

Oh right! Please also see this readme: https://github.com/holiman/goevmlab/tree/master/evms/testdata

If you could provide the eels output for those testcases, that would be the next step. I will also play with this today though

holiman commented 6 months ago

How do I actually run the thing? I tried on a debian image, using an apt-installed python3, python3-pip, python3-setuptools, followed by python3 setup.py install, and then

root@f225092db06e:/execution-specs# ethereum-spec-evm -h
Traceback (most recent call last):
  File "/usr/local/bin/ethereum-spec-evm", line 33, in <module>
    sys.exit(load_entry_point('ethereum==0.1.0', 'console_scripts', 'ethereum-spec-evm')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/ethereum-spec-evm", line 25, in importlib_load_entry_point
    return next(matches).load()
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
    module = import_module(match.group('module'))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/lib/python3.11/dist-packages/ethereum-0.1.0-py3.11.egg/ethereum_spec_tools/evm_tools/__init__.py", line 12, in <module>
    from .b11r import B11R, b11r_arguments
  File "/usr/local/lib/python3.11/dist-packages/ethereum-0.1.0-py3.11.egg/ethereum_spec_tools/evm_tools/b11r/__init__.py", line 12, in <module>
    from ..utils import get_stream_logger
  File "/usr/local/lib/python3.11/dist-packages/ethereum-0.1.0-py3.11.egg/ethereum_spec_tools/evm_tools/utils.py", line 10, in <module>
    import coincurve
  File "/usr/local/lib/python3.11/dist-packages/coincurve-18.0.0-py3.11-linux-x86_64.egg/coincurve/__init__.py", line 1, in <module>
    from coincurve.context import GLOBAL_CONTEXT, Context
  File "/usr/local/lib/python3.11/dist-packages/coincurve-18.0.0-py3.11-linux-x86_64.egg/coincurve/context.py", line 4, in <module>
    from coincurve.flags import CONTEXT_ALL, CONTEXT_FLAGS
  File "/usr/local/lib/python3.11/dist-packages/coincurve-18.0.0-py3.11-linux-x86_64.egg/coincurve/flags.py", line 1, in <module>
    from ._libsecp256k1 import lib
ModuleNotFoundError: No module named '_cffi_backend'

Note, there was one error during install:

Installed /usr/local/lib/python3.11/dist-packages/remerkleable-0.1.24-py3.11.egg
error: pycryptodome 3.20.0 is installed but pycryptodome==3.9.4 is required by {'eth2spec'}

These are more or less the steps I followed

RUN apt-get install -qy --no-install-recommends  git     
RUN git clone https://github.com/ethereum/execution-specs.git --branch statetests --depth 1
RUN apt-get install -qy --no-install-recommends  python3 python3-pip python3-setuptools
RUN cd execution-specs && python3 setup.py install

SamWilsn commented 6 months ago

You have to use pip. Calling python setup.py directly is deprecated.

If you want to split the clone and the install into two steps, it should look something like:

git clone https://github.com/ethereum/execution-specs.git --branch statetests --depth 1
pip install ./execution-specs

holiman commented 6 months ago

If I do that, then I get

#12 [9/9] RUN pip install ./execution-specs
#12 3.502 error: externally-managed-environment
#12 3.502 
#12 3.502 × This environment is externally managed
#12 3.502 ╰─> To install Python packages system-wide, try apt install
...

SamWilsn commented 6 months ago

Generated the testdata and took a stab at the Dockerfile. While we don't entirely barf on the test inputs, I very much cannot promise the outputs are correct :rofl:

holiman commented 6 months ago

Sweet! So, the traces/roots allow us to tune the 'shim' for eels, in eels.go. The first error is

prev:           both: {"depth":2,"pc":47,"gas":3161521,"op":80,"opName":"POP","stack":["0x1"]}
diff:         besuvm: {"depth":2,"pc":48,"gas":3161519,"op":93,"opName":"TSTORE","stack":[]}
diff:         eelsvm: {"depth":1,"pc":1751,"gas":50352,"op":80,"opName":"POP","stack":["0x17c3328bc4714dc8c7a1910f3768b172314d9d00","0x0"]}

But actually, eels does show the pc=48 output line too:

{"pc":47,"op":80,"gas":"0x303db1","gasCost":"0x2","memSize":0,"stack":["0x1"],"depth":2,"refund":0,"opName":"POP"}
{"pc":48,"gas":"0x303daf","gasCost":"0x0","memSize":0,"stack":[],"depth":2,"refund":0,"opName":"InvalidOpcode","error":"InvalidOpcode"}
{"pc":1751,"op":80,"gas":"0xc4b0","gasCost":"0x2","memSize":1504,"stack":["0x17c3328bc4714dc8c7a1910f3768b172314d9d00","0x0"],"depth":1,"refund":0,"opName":"POP"}

So, what happens here is that eels does not provide the op which was invalid, hence it is parsed as 0, which is the STOP opcode. And STOP is filtered away, due to differences in how implicit stops at end of code are treated by clients [1].

This appears to be the only failure surfaced by the reference outputs, which is very good! Unfortunately, it's not something that can be fixed on the goevmlab-side, at least not without quite a lot of work (namely: go through all other vms and zero out the opcode on errors)

[1]: When geth encounters end of code, it executes a STOP. Other clients do not output anything for that. Hence, goevmlab strips all STOP.
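To make the failure mode concrete, here is a minimal Python sketch of the effect described above. It is not the actual eels.go code (which is not shown in this thread): `canonical_ops` and `STOP` are hypothetical names, and the default of 0 for a missing "op" only mimics how a missing JSON field ends up as a zero value. The input lines are the eels trace lines quoted above.

```python
import json

STOP = 0x00  # opcode value of STOP


def canonical_ops(jsonl_lines, strip_stop=True):
    """Yield (pc, op) pairs roughly the way a comparison shim might see them.

    Hypothetical helper: a missing "op" field falls back to 0, which is
    indistinguishable from STOP, so the step is dropped by the STOP filter.
    """
    for line in jsonl_lines:
        step = json.loads(line)
        if "pc" not in step:
            continue  # not a per-op line (e.g. a stateRoot summary)
        op = step.get("op", 0)  # missing op silently becomes 0 == STOP
        if strip_stop and op == STOP:
            continue
        yield step["pc"], op


eels_trace = [
    '{"pc":47,"op":80,"gas":"0x303db1","gasCost":"0x2","memSize":0,"stack":["0x1"],"depth":2,"refund":0,"opName":"POP"}',
    '{"pc":48,"gas":"0x303daf","gasCost":"0x0","memSize":0,"stack":[],"depth":2,"refund":0,"opName":"InvalidOpcode","error":"InvalidOpcode"}',
    '{"pc":1751,"op":80,"gas":"0xc4b0","gasCost":"0x2","memSize":1504,"stack":["0x17c3328bc4714dc8c7a1910f3768b172314d9d00","0x0"],"depth":1,"refund":0,"opName":"POP"}',
]

# The pc=48 step vanishes because it carries no "op" field.
seen = list(canonical_ops(eels_trace))

# If the tracer included the failing opcode (93 in the besu line), it would survive.
fixed_line = json.dumps({**json.loads(eels_trace[1]), "op": 93})
seen_fixed = list(canonical_ops([eels_trace[0], fixed_line, eels_trace[2]]))
```

With the op included on the error line, the pc=48 step is no longer mistaken for an implicit STOP, which is why the fix has to happen on the eels side.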

holiman commented 6 months ago

I was able to simplify the output-processing a bit, and now all tests are enabled on the "reference outputs". Once the inclusion of the op on error is fixed, the next step will be to actually run the fuzzer

holiman commented 6 months ago

I'm actually able to run the fuzzer for a while -- here's geth vs eels. After a while though, it hits the error described above

INFO [03-20|09:35:28.048] Executing                                tests=24 time=24.002s test/s=1.0 "avg steps"=1682.9 global=639
INFO [03-20|09:35:28.048] Stats gethbatch-0                        execSpeed=44.3ms  longest=370.281184ms count=25
INFO [03-20|09:35:28.048] Stats eelsbatch-0                        execSpeed=548ms   longest=8.648526777s count=24
...
Consensus error
Testcase: /fuzztmp/00000006-mixed-0.json
- gethbatch-0: ./gethbatch-0-output.jsonl
  - command: /gethvm --json --noreturndata --nomemory statetest
- eelsbatch-0: ./eelsbatch-0-output.jsonl
  - command: /ethereum-spec-evm statetest --json --noreturndata --nomemory
-------
prev:           both: {"depth":1,"pc":221,"gas":15908830,"op":80,"opName":"POP","stack":["0x0"]}
diff:    gethbatch-0: {"depth":1,"pc":222,"gas":15908828,"op":249,"opName":"opcode 0xf9 not defined","stack":[]}
diff:    eelsbatch-0: {"stateRoot":"0x450a084a89ad32a11758d0359e92d901ff1ad67334503e6f03cdec554f4c1b0c"}

holiman commented 6 months ago

Found another flaw. Doing --skiptrace, where we do not actually check the per-op output, just the stateroot, it gets stuck

/generic-fuzzer   --eelsbatch=$EELS_BIN --gethbatch=$GETH_BIN --outdir /fuzztmp/  --fork Cancun --skiptrace
...
INFO [03-20|09:40:30.417] Executing                                tests=0 time=40.000s test/s=0.0 "avg steps"=0.0 global=666
INFO [03-20|09:40:30.417] Stats geth-0                             execSpeed=2.5ms longest=50.582954ms count=1
INFO [03-20|09:40:30.417] Stats eelsbatch-0                        execSpeed=0s    longest=0s          count=0

That seems to be a problem with eelsbatch, not eels though. I think I know what the problem is, it's something I just fixed the other day in geth too.

When in "skip trace" mode (a.k.a. speedMode), we still read from stderr. But goevmlab does not supply the --json switch, so the { "stateroot": ... line is not output on stderr. There is JSON output on stdout, but we're not looking at it.

So the copyUntilEnd just doesn't get anything. The easiest fix is if you make it always output the stateroot on stderr, which is how I fixed geth here https://github.com/ethereum/go-ethereum/pull/29290
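As a rough illustration of why the reader starves, here is a Python sketch of scanning a stderr stream for the state root. `read_state_root` is a hypothetical helper, not goevmlab's copyUntilEnd, and the key name "stateRoot" is taken from the fuzzer output earlier in this thread. In the real setup the reader would block waiting for more input; here, returning None stands in for "nothing ever arrived".

```python
import io
import json


def read_state_root(stderr_stream):
    """Return the first stateRoot found on the stream, or None if there is none.

    Hypothetical helper: it only looks at lines that parse as JSON objects.
    """
    for line in stderr_stream:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue
        root = obj.get("stateRoot")
        if root is not None:
            return root
    return None


# With the root always written to stderr, the reader finds it.
with_root = io.StringIO(
    '{"stateRoot":"0x450a084a89ad32a11758d0359e92d901ff1ad67334503e6f03cdec554f4c1b0c"}\n'
)
# Without --json (and without the fix), stderr carries nothing useful.
without_root = io.StringIO("")

found = read_state_root(with_root)
missing = read_state_root(without_root)
```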

holiman commented 6 months ago

Using the simpleops engine, which typically won't trigger errors but mostly does arithmetic, I've been able to run it for an hour:

root@6c9a43bcd06f:/#  /generic-fuzzer  --outdir=/fuzztmp --gethbatch=$GETH_BIN --eelsbatch=$EELS_BIN --engine simpleops 
...
INFO [03-20|11:29:51.853] Executing                                tests=400 time=1h0m40.002s test/s=0.1 "avg steps"=24822.6 global=1609
INFO [03-20|11:29:51.853] Stats gethbatch-0                        execSpeed=360.2ms longest=592.013388ms  count=401
INFO [03-20|11:29:51.853] Stats eelsbatch-0                        execSpeed=8.8678s longest=10.390815251s count=400

It's extremely slow, averaging ~9s where geth takes 0.36s, but I guess very long sections (average 24822 steps) of arithmetic are where it performs worst per gas.

holiman commented 6 months ago

Some general observations:

When I run statetests, my feeling is that the performance degrades over time, both in eelsbatch and in eels standalone mode. That seems strange though, I don't see how non-batched mode could degrade, because the interpreter exits between runs.
https://github.com/ethereum/execution-specs/blob/statetests/src/ethereum_spec_tools/evm_tools/t8n/evm_trace.py#L189 the isinstance checks seem expensive. Perhaps move OpStart to the top, because that's the most common one?
Similarly, at https://github.com/ethereum/execution-specs/blob/statetests/src/ethereum_spec_tools/evm_tools/t8n/evm_trace.py#L151, change if isinstance(evm, EvmWithReturnData) and trace_return_data: into if trace_return_data and isinstance(evm, EvmWithReturnData):
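The reordering suggested above relies on short-circuit evaluation: when the cheap boolean is false, the isinstance call is never made. Below is a small Python sketch of the idea. `HasReturnData` and `FakeEvm` are hypothetical stand-ins, not the EELS classes, and modelling the check as a runtime-checkable Protocol (one case where isinstance is noticeably costly) is an assumption about why it might be expensive.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasReturnData(Protocol):
    """Hypothetical stand-in for a structural type with return data."""

    return_data: bytes


class FakeEvm:
    """Hypothetical EVM object, used only for this sketch."""

    def __init__(self) -> None:
        self.return_data = b""


def wants_return_data_slow(evm, trace_return_data: bool) -> bool:
    # isinstance runs on every call, even when the flag is off.
    return isinstance(evm, HasReturnData) and trace_return_data


def wants_return_data(evm, trace_return_data: bool) -> bool:
    # Cheap flag first: isinstance is skipped entirely when it is False.
    return trace_return_data and isinstance(evm, HasReturnData)
```

Both versions return the same result for every input; only the amount of work done when tracing of return data is disabled differs. The same reasoning applies to ordering a chain of isinstance checks so that the most frequent event type is tested first.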

holiman commented 6 months ago

After 1m40s

INFO [03-20|13:07:40.579] Executing                                tests=14 time=1m40.002s test/s=0.1 "avg steps"=12711.2 global=2037
INFO [03-20|13:07:40.579] Stats gethbatch-0                        execSpeed=182.5ms longest=465.697344ms count=15
INFO [03-20|13:07:40.579] Stats eels-0                             execSpeed=3.5788s longest=7.755425286s count=14

After 2m40s:

INFO [03-20|13:08:40.578] Executing                                tests=23 time=2m40.001s test/s=0.1 "avg steps"=17191.4 global=2046
INFO [03-20|13:08:40.578] Stats gethbatch-0                        execSpeed=231.7ms longest=465.697344ms count=24
INFO [03-20|13:08:40.578] Stats eels-0                             execSpeed=4.7619s longest=7.755425286s count=23

After 3m40s :

INFO [03-20|13:09:40.578] Executing                                tests=31 time=3m40.001s test/s=0.1 "avg steps"=19755.9 global=2054
INFO [03-20|13:09:40.578] Stats gethbatch-0                        execSpeed=279.8ms longest=465.697344ms count=32
INFO [03-20|13:09:40.578] Stats eels-0                             execSpeed=5.6199s longest=7.755425286s count=31

> my feeling is that the performance degrades over time

So, I think I know what the problem here is. For averaging, goevmlab doesn't use a real average, but rather a sliding window (exponentially weighted) average.

0.95*current + 0.05*float64(sample)

So it is 95% based on the previous value, and only 5% on the most recent sample. So if we've only done a few tens of runs, the "average" is still heavily affected by the initial value of 0.

So that's a red herring that can be ignored (in this context)
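A quick way to see the warm-up effect is to feed a constant sample into the update rule quoted above, starting from 0. The Python sketch below does that; `update_average` and `reported_average` are hypothetical names, and only the 0.95/0.05 weights and the initial value of 0 come from the comment above.

```python
def update_average(current: float, sample: float) -> float:
    """One step of the update rule: 95% previous value, 5% newest sample."""
    return 0.95 * current + 0.05 * float(sample)


def reported_average(samples, initial: float = 0.0) -> float:
    """Apply the update rule to every sample, starting from `initial`."""
    avg = initial
    for sample in samples:
        avg = update_average(avg, sample)
    return avg


# Every run takes exactly 9.0 seconds, yet the reported value climbs slowly.
after_14_runs = reported_average([9.0] * 14)
after_400_runs = reported_average([9.0] * 400)
```

For a constant sample v, the reported value after n runs is v * (1 - 0.95**n), so after 14 runs it is still only about half of the true mean, and it approaches v only after a few hundred runs. That produces exactly the kind of steadily rising execSpeed seen in the logs above, without any real slowdown.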

holiman commented 6 months ago

Hm. Do you collect traces internally before emitting them? https://github.com/ethereum/execution-specs/blob/statetests/src/ethereum_spec_tools/evm_tools/t8n/evm_trace.py#L207

I'd recommend against doing so. The reason we use jsonl instead of json is to allow the client under test to not collect tens of thousands of lines of output in memory, but just stream it out and forget about it.
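For illustration, a streaming tracer can look roughly like the Python sketch below: each step is serialized and written as soon as it happens, so memory use stays flat no matter how long the execution is. `emit_step` and `run_and_stream` are hypothetical names and do not reflect how evm_trace.py is actually structured.

```python
import io
import json


def emit_step(out, step: dict) -> None:
    """Write one trace step as a single JSON line and forget about it."""
    out.write(json.dumps(step, separators=(",", ":")))
    out.write("\n")


def run_and_stream(steps, out) -> int:
    """Consume steps lazily and stream each one out; nothing is accumulated."""
    count = 0
    for step in steps:
        emit_step(out, step)
        count += 1
    return count


# The generator stands in for the interpreter producing one step at a time.
buffer = io.StringIO()  # in real use this would be sys.stderr or a file
written = run_and_stream(({"pc": pc, "op": 80} for pc in range(3)), buffer)
lines = buffer.getvalue().splitlines()
```

Because every line is a complete JSON document, the consumer can also parse and compare steps as they arrive instead of waiting for the run to finish.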

holiman commented 6 months ago

Awesome! I'm running this on my machine now, I'll let it run geth vs eels for a while then merge. Thanks for this!

holiman commented 6 months ago

With all engines enabled:

INFO [03-21|08:02:14.411] Executing                                tests=2763 time=1h0m40.000s     test/s=0.8 "avg steps"=3023.7 global=2763
INFO [03-21|08:02:14.411] Stats gethbatch-0                        execSpeed=66.4ms  longest=2.459574602s    count=2764
INFO [03-21|08:02:14.411] Stats eelsbatch-0                        execSpeed=841ms   longest=1m47.133973488s count=2763

:heavy_check_mark:

holiman commented 6 months ago

Would be nice to be able to remove --branch statetests from the dockerfile before merging.

holiman commented 6 months ago

Wow, the --json really takes a toll. Here's executing one of the arithmetic tests twice, first with --json then without:

root@f4371a5823ca:/# time yes /fuzztmp/00000006-mixed-2.json | head -n2 | /ethereum-spec-evm statetest --json --noreturndata --nomemory > /dev/null 2>&1 

real    0m18.130s
user    0m17.637s
sys     0m0.209s
root@f4371a5823ca:/# time yes /fuzztmp/00000006-mixed-2.json | head -n2 | /ethereum-spec-evm statetest  --noreturndata --nomemory > /dev/null 2>&1 

real    0m0.758s
user    0m0.690s
sys     0m0.056s

18s vs 0.8s.

holiman commented 6 months ago

New docker image pushed

INFO [03-21|11:16:49.253] Executing                                tests=564 time=32.000s test/s=17.6 "avg steps"=1992.8 global=564
INFO [03-21|11:16:49.253] Stats gethbatch-0                        execSpeed=134.9ms longest=1.001198525s  count=225
INFO [03-21|11:16:49.253] Stats eelsbatch-0                        execSpeed=2.706s  longest=14.363253487s count=17
INFO [03-21|11:16:49.253] Stats nethbatch-0                        execSpeed=88.2ms  longest=2.053624955s  count=226
INFO [03-21|11:16:49.253] Stats besubatch-0                        execSpeed=73.2ms  longest=7.471210202s  count=148
INFO [03-21|11:16:49.253] Stats erigonbatch-0                      execSpeed=155.4ms longest=1.24729477s   count=178
INFO [03-21|11:16:49.253] Stats nimbus-0                           execSpeed=707.5ms longest=5.90771302s   count=34
INFO [03-21|11:16:49.253] Stats evmone-0                           execSpeed=148.1ms longest=883.202271ms  count=223
INFO [03-21|11:16:49.253] Stats revm-0                             execSpeed=233.1ms longest=3.831558802s  count=80

All 8 clients!

holiman / goevmlab

Support `ethereum-spec-evm statetest` #131