paradigmxyz / revmc

JIT and AOT compiler for the Ethereum Virtual Machine, built on Revm.
Apache License 2.0

Segfault #61

Closed 0xDmtri closed 3 months ago

0xDmtri commented 3 months ago

After updating the library I started getting SIGSEGV. The commit cec178dcef3ce809c6d567dca9f8122464d5a53d still works just fine. It stopped working when REVM was bumped to 12.1.

I decided to debug it via GDB, attaching logs:

P.S. libdexy is the bytecode I statically linked via:

revmc_context::extern_revmc! {
    fn libdexy;
}

(gdb) backtrace
#0  0x0000555555ad0bf9 in libdexy ()

#1  0x0000555555a82222 in revmc_context::EvmCompilerFn::call (self=..., stack=..., stack_len=..., ecx=0x7bfee2ff6da8)
    at /root/.cargo/git/checkouts/revmc-cb5494447e430bb4/4ad65bd/crates/revmc-context/src/lib.rs:320

#2  0x0000555555a81d4a in revmc_context::EvmCompilerFn::call_with_interpreter (self=..., interpreter=0x7bfee1c7c400, host=...)
    at /root/.cargo/git/checkouts/revmc-cb5494447e430bb4/4ad65bd/crates/revmc-context/src/lib.rs:271

#3  0x0000555555a8202a in revmc_context::EvmCompilerFn::call_with_interpreter_and_memory (self=..., interpreter=0x7bfee1c7c400, memory=0x7bfee2ff73c0, host=...)
    at /root/.cargo/git/checkouts/revmc-cb5494447e430bb4/4ad65bd/crates/revmc-context/src/lib.rs:247

#4  0x0000555555ad7e9a in strategy::simulator::register_handler::{closure#0}<revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>> (frame=0x7bfee1c80000, memory=0x7bfee2ff73c0,
    tables=0x7bfee2ff9e78, context=0x7bfee2ff9af0) at crates/strategy/src/simulator/mod.rs:107

#5  0x0000555555bfb732 in revm::handler::handle_types::execution::ExecutionHandler<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>>::execute_frame<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>> (self=0x7bfee2ff9dc8, frame=0x7bfee1c80000, shared_memory=0x7bfee2ff73c0,
    instruction_tables=0x7bfee2ff9e78, context=0x7bfee2ff9af0) at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/revm-12.1.0/src/handler/handle_types/execution.rs:175

#6  0x0000555555c1b6bc in revm::handler::Handler<revm::context::Context<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>>, strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>>::execute_frame<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>> (
    self=0x7bfee2ff9d00, frame=0x7bfee1c80000, shared_memory=0x7bfee2ff73c0, context=0x7bfee2ff9af0) at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/revm-12.1.0/src/handler.rs:118

#7  0x0000555555aea394 in revm::evm::Evm<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>>::run_the_loop<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>> (self=0x7bfee2ff9af0, first_frame=...) at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/revm-12.1.0/src/evm.rs:95

#8  0x0000555555aebec8 in revm::evm::Evm<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>>::transact_preverified_inner<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>> (self=0x7bfee2ff9af0, initial_gas_spend=23192)
    at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/revm-12.1.0/src/evm.rs:371

#9  0x0000555555aec71d in revm::evm::Evm<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>>::transact<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>> (self=0x7bfee2ff9af0) at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/revm-12.1.0/src/evm.rs:236

#10 0x0000555555aeb584 in revm::evm::Evm<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>>::transact_commit<strategy::simulator::ExternalContext, revm::db::in_memory_db::CacheDB<ethmatrix::duality::Duality>> (self=0x7bfee2ff9af0) at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/revm-12.1.0/src/evm.rs:46

#11 0x0000555555c19678 in strategy::simulator::dexy::evaluate_arb_revenue (arb_amount=..., next_block=0x7fffffffc0d0, duality=..., cycle=0x7bfeb5e12040, victim_txs=...)
    at crates/strategy/src/simulator/lil_dexy.rs:216

#12 0x0000555555adbef5 in strategy::simulator::dexy::find_optimal_inputs::{closure#2} () at crates/strategy/src/simulator/lil_dexy.rs:105

What we can understand from the backtrace (imho):

  1. Frame 0: The segmentation fault occurred inside libdexy, the AOT-compiled function declared via the extern_revmc! macro.
  2. Frame 1: libdexy was called through its function pointer by revmc_context::EvmCompilerFn::call.
  3. Frame 2: That call was made from revmc_context::EvmCompilerFn::call_with_interpreter.
  4. Frame 3: Which was in turn called by revmc_context::EvmCompilerFn::call_with_interpreter_and_memory.
  5. Frames 4-11: The remaining callers are in the strategy::simulator and revm crates: the registered handler closure, revm's execute_frame and transact path, and evaluate_arb_revenue.
  6. Frame 12: The outermost frame shown is a closure in strategy::simulator::dexy::find_optimal_inputs (crates/strategy/src/simulator/lil_dexy.rs:105).

Locals info:

(gdb) frame 1

#1  0x0000555555a82222 in revmc_context::EvmCompilerFn::call (self=..., stack=..., stack_len=..., ecx=0x7bfee2ff6da8) at /root/.cargo/git/checkouts/revmc-cb5494447e430bb4/4ad65bd/crates/revmc-context/src/lib.rs:320
320             (self.0)(

(gdb) info locals
No locals.

(gdb) info args
self = revmc_context::EvmCompilerFn (0x555555ac33b0 <libdexy>)
stack = core::option::Option<&mut revmc_context::EvmStack>::Some(0x7bfee1c72600)
stack_len = core::option::Option<&mut usize>::Some(0x7bfee1c7c520)
ecx = 0x7bfee2ff6da8

(gdb) print self
$1 = revmc_context::EvmCompilerFn (0x555555ac33b0 <libdexy>)

(gdb) print self.0
$2 = (*mut fn (*mut revm_interpreter::gas::Gas, *mut revmc_context::EvmStack, *mut usize, *mut revm_primitives::env::Env, *mut revm_interpreter::interpreter::contract::Contract, *mut revmc_context::EvmContext) -> revm_interpreter::instruction_result::InstructionResult) 0x555555ac33b0 <libdexy>
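The args above show the shape of the wrapper: `call` receives `Option<&mut ...>` arguments and forwards them to a raw function pointer (`(self.0)(` at lib.rs:320). Below is a minimal, self-contained sketch of that pattern. `Stack`, `Status`, `CompiledFn`, and `fake_compiled` are hypothetical stand-ins, not revmc types, and the real function takes six pointer arguments rather than two. The sketch only illustrates that the Rust side hands over raw pointers and trusts the callee, so a bad access would surface inside the callee (frame 0) rather than in the wrapper.

```rust
use std::ptr;

// Hypothetical stand-in for the EVM stack: four 32-byte slots.
#[repr(C)]
struct Stack {
    words: [[u8; 32]; 4],
}

// Hypothetical stand-in for the result code returned by compiled code.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Stop = 1,
    Fault = 2,
}

// Simplified raw signature (assumption: two pointers instead of six).
type RawFn = unsafe extern "C" fn(*mut Stack, *mut usize) -> Status;

// Newtype around the raw function pointer, mirroring the `(self.0)(...)` call.
struct CompiledFn(RawFn);

impl CompiledFn {
    /// Safety: the callee must only touch memory reachable through the pointers.
    unsafe fn call(&self, stack: Option<&mut Stack>, stack_len: Option<&mut usize>) -> Status {
        // `None` becomes a null pointer, `Some(r)` becomes the reference's address.
        let stack_ptr = stack.map_or(ptr::null_mut(), |s| s as *mut Stack);
        let len_ptr = stack_len.map_or(ptr::null_mut(), |l| l as *mut usize);
        unsafe { (self.0)(stack_ptr, len_ptr) }
    }
}

// Stand-in for a compiled function: pushes the byte 0x24 onto the stack.
unsafe extern "C" fn fake_compiled(stack: *mut Stack, stack_len: *mut usize) -> Status {
    if stack.is_null() || stack_len.is_null() {
        return Status::Fault;
    }
    // Both pointers were checked for null; the caller guarantees they are valid.
    let stack = unsafe { &mut *stack };
    let len = unsafe { &mut *stack_len };
    if *len >= stack.words.len() {
        return Status::Fault;
    }
    stack.words[*len][0] = 0x24;
    *len += 1;
    Status::Stop
}

fn main() {
    let f = CompiledFn(fake_compiled);
    let mut stack = Stack { words: [[0; 32]; 4] };
    let mut len = 0usize;
    let status = unsafe { f.call(Some(&mut stack), Some(&mut len)) };
    println!("{:?}, len = {}, first byte = {:#x}", status, len, stack.words[0][0]);
}
```

Unlike `fake_compiled`, real generated code has no reason to null-check or bounds-check its inputs, so the guarantees rest entirely on the caller and on the compiler that produced the code.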

Mem dump for Frame 1:

Arg `stack`:

(gdb) x/16x 0x7bfee1c72600
0x7bfee1c72600: 0x00000024      0x00000000      0x00000000      0x00000000
0x7bfee1c72610: 0x00000000      0x00000000      0x00000000      0x00000000
0x7bfee1c72620: 0x00000064      0x00000000      0x00000000      0x00000000
0x7bfee1c72630: 0x00000000      0x00000000      0x00000000      0x00000000
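For reading the dump above, here is a rough sketch. It assumes each EVM stack slot is a 32-byte word stored in native little-endian byte order on this x86_64 machine, which is an assumption about the `EvmStack` layout and not something the logs confirm. Under that assumption, the sixteen 32-bit words printed by `x/16x` cover two stack slots. The helper names are made up for illustration.

```rust
/// Reassemble one 32-byte slot from eight 32-bit words as printed by `x/16x`.
/// Assumes the target is little-endian, so each word's low byte comes first.
fn slot_from_gdb_words(words: [u32; 8]) -> [u8; 32] {
    let mut bytes = [0u8; 32];
    for (i, w) in words.iter().enumerate() {
        bytes[i * 4..i * 4 + 4].copy_from_slice(&w.to_le_bytes());
    }
    bytes
}

/// Read the slot as a little-endian integer.
/// Returns `None` if the value does not fit in 128 bits.
fn slot_as_u128(slot: [u8; 32]) -> Option<u128> {
    if slot[16..].iter().any(|&b| b != 0) {
        return None;
    }
    let mut low = [0u8; 16];
    low.copy_from_slice(&slot[..16]);
    Some(u128::from_le_bytes(low))
}

fn main() {
    // The sixteen words from the dump of the `stack` argument.
    let dump: [u32; 16] = [
        0x24, 0, 0, 0, 0, 0, 0, 0, // 0x7bfee1c72600 .. 0x7bfee1c7261f
        0x64, 0, 0, 0, 0, 0, 0, 0, // 0x7bfee1c72620 .. 0x7bfee1c7263f
    ];
    for (i, chunk) in dump.chunks_exact(8).enumerate() {
        let mut words = [0u32; 8];
        words.copy_from_slice(chunk);
        println!("slot {}: {:?}", i, slot_as_u128(slot_from_gdb_words(words)));
    }
    // Prints:
    // slot 0: Some(36)
    // slot 1: Some(100)
}
```

Read this way, the first two slots hold 0x24 (36) and 0x64 (100). Whether those values are live depends on the stack length, which the dump of `stack` alone does not show.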

Arg `stack_len`:

(gdb) x/16x 0x7bfee1c7c520
0x7bfee1c7c520: 0x0000000000000000      0x0000000000000000
0x7bfee1c7c530: 0x0000000000000008      0x0000000000000000
0x7bfee1c7c540: 0x0000000000000000      0x00007bfee2ffea58
0x7bfee1c7c550: 0x00007bfee2ffea58      0x00000000000cbfe8
0x7bfee1c7c560: 0x00007bfee2ffea58      0x00007bfee1c7c400
0x7bfee1c7c570: 0x00007bfee2ffef50      0x00000000000cbfe8
0x7bfee1c7c580: 0x00007bfee2ffea70      0x000055555745cc08
0x7bfee1c7c590: 0x00007bfee1c87000      0x0000000000000f76

Arg `ecx`:

(gdb) x/16x 0x7bfee2ff6da8
0x7bfee2ff6da8: 0x00007bfee2ff9af0      0x00005555573a3338
0x7bfee2ff6db8: 0x0000000000000001      0x0000000000000000
0x7bfee2ff6dc8: 0x0000000000000000      0x00007bfee1c7c4d8
0x7bfee2ff6dd8: 0x00007bfee1c7c400      0x00007bfee1c7c5d0
0x7bfee2ff6de8: 0x00007bfee1c7c548      0x00007bfee1c7c528
0x7bfee2ff6df8: 0x00007fff6df10000      0x00007bfee2ff9af0
0x7bfee2ff6e08: 0x00005555573a3338      0x0000000000000001
0x7bfee2ff6e18: 0x0000000000000000      0x0000000000000000

I tried to examine the stack reference directly but could not dereference it, so I inspected memory with the x/16x commands shown above and concluded the following:

Place in the code where it crashes: https://github.com/paradigmxyz/revmc/blob/main/crates/revmc-context/src/lib.rs#L271

DaniPopes commented 3 months ago

Thanks for the detailed issue! Would like to confirm that this happens on a clean build (cargo clean) and with updated and equal versions for all packages, since the code is very sensitive about all of this.

I'd appreciate it if you could either provide the bytecode and the relevant inputs and environment to reproduce this or, if that's not possible, bisect to the commit where this segfault first occurs. The range you gave is https://github.com/paradigmxyz/revmc/compare/cec178dcef3ce809c6d567dca9f8122464d5a53d...main.
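One plausible reason version mismatches matter here, offered as an assumption rather than something stated in the thread: code compiled ahead of time reads and writes host structs at fixed byte offsets, so if the host is built against a different version of a dependency, those offsets can silently point at the wrong field. A minimal sketch with two hypothetical layouts (`CtxOld` and `CtxNew` are invented for illustration and do not correspond to real revm types):

```rust
// Layout that the compiled code was (hypothetically) generated against.
#[repr(C)]
struct CtxOld {
    limit: u64,
    remaining: u64,
}

// Layout in a (hypothetical) newer dependency version: one field was inserted.
#[repr(C)]
struct CtxNew {
    limit: u64,
    refunded: i64,
    remaining: u64,
}

fn main() {
    let old = std::mem::offset_of!(CtxOld, remaining);
    let new = std::mem::offset_of!(CtxNew, remaining);
    // Code that hard-codes the old offset would touch `refunded` in the new layout.
    println!("remaining: old offset = {}, new offset = {}", old, new);
}
```

With `#[repr(C)]` the offsets here are 8 and 16. Without it, Rust is free to reorder fields, which makes matching exact versions and build settings even more important.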

0xDmtri commented 3 months ago

> Thanks for the detailed issue! Would like to confirm that this happens on a clean build (cargo clean) and with updated and equal versions for all packages, since the code is very sensitive about all of this.
>
> I'd appreciate if you could either provide the bytecode and relevant inputs and environment to reproduce this, or, if not possible, see if you can bisect the commit where this segfault occurs, the range you gave is cec178d...main.

Can confirm that I did cargo clean and tested with both debug and release profiles. As for deps, my alloy-primitives is at 7.7 while revmc's is at 7.1.

I'm gonna make a repro example repo, ser, sounds good. Bisecting the commit is also possible, yeah.

0xDmtri commented 3 months ago

@DaniPopes Here's an easy repro that produces the segfault (well, at least for me, hehe).

https://github.com/0xDmtri/segfault-revmc-repro

0xDmtri commented 3 months ago

OK, wild fact: it bloody works on my M1 Pro Mac... but it doesn't on my AMD box... Mental.

I think I'm going insane already; I've spent 2 weeks debugging all these things.

Probably related to Revm optimizations? It's the common crate between Reth and Revmc. But then why would it work without opts for Reth but not for Revmc? I am even more confused now...

0xDmtri commented 3 months ago

lscpu:

Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          48 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   32
  On-line CPU(s) list:    0-31
Vendor ID:                AuthenticAMD
  BIOS Vendor ID:         Advanced Micro Devices, Inc.
  Model name:             AMD Ryzen 9 7950X3D 16-Core Processor
    BIOS Model name:      AMD Ryzen 9 7950X3D 16-Core Processor           Unknown CPU @ 4.2GHz
    BIOS CPU family:      107
    CPU family:           25
    Model:                97
    Thread(s) per core:   2
    Core(s) per socket:   16
    Socket(s):            1
    Stepping:             2
    Frequency boost:      enabled
    CPU(s) scaling MHz:   58%
    CPU max MHz:          5758.5928
    CPU min MHz:          3000.0000
    BogoMIPS:             8384.18
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good
                          amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm
                          cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3
                          hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma
                          clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local avx512_bf16 clzero irperf xsaveerptr
                          rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl
                          avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca fsrm flush_l1d amd_lbr_pmc_freeze
Virtualization features:
  Virtualization:         AMD-V
Caches (sum of all):
  L1d:                    512 KiB (16 instances)
  L1i:                    512 KiB (16 instances)
  L2:                     16 MiB (16 instances)
  L3:                     128 MiB (2 instances)
NUMA:
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-31
Vulnerabilities:
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Not affected
  Reg file data sampling: Not affected
  Retbleed:               Not affected
  Spec rstack overflow:   Mitigation; safe RET, no microcode
  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
  Srbds:                  Not affected
  Tsx async abort:        Not affected

DaniPopes commented 3 months ago

I can reproduce, but only when optimizing the bytecode, and only if I don't add a printf at every instruction. Not sure what's going on; will investigate further.

0xDmtri commented 3 months ago

> I can reproduce, but only when optimizing the bytecode, and if I don't add printf at every instruction. Not sure what's going on, will investigate further

That's wicked, mate, haha. If you need any help, def lmk, I'm here to assist!

DaniPopes commented 3 months ago

Maybe an LLVM miscompilation, or some undefined behavior related to the MSTORE builtin, because adding noinline fixes the segfault.

0xDmtri commented 3 months ago

> Maybe an LLVM miscompilation, or some undefined behavior related to MSTORE builtin, because adding noinline fixes the segfault.

Interesting. I wonder what's causing it in Reth.

DaniPopes commented 3 months ago

Don't know, it's unrelated, and a known issue.