Hpmason / retour-rs

A cross-platform detour library written in Rust
Other
99 stars 18 forks source link

Dropping GenericDetour causes a poisoned mutex/rwlock in slice_pool crate for unknown reason #28

Closed mgaertne closed 11 months ago

mgaertne commented 11 months ago

On linux, I face a problem when dropping one of the detours I have put in place. The panic stems from the slice_pool crate where an RwLock or Mutex gets poisoned, I think. I'm not sure why this happens. Basically, I have four functions that I detour into that get replaced whenever a VM is shutdown and re-created. Dropping three of the four detours into that VM works fine, just one causing reproducible problems. I'm not sure what I could do to investigate this further. Any advice would be helpful.

mgaertne commented 11 months ago

To shed some more light on the detours I have in place: In general I hooked into the following functions in a static like way using GenericDetours and RawDetours (for the two C-Variadic functions): GenericDetour<fn(const c_char, unsafe extern "C" fn())> GenericDetour<fn(const c_char, unsafe extern "C" fn())> GenericDetour<fn(mut client_t, const c_char, qboolean)> GenericDetour<fn(mut client_t, mut usercmd_t)> GenericDetour<fn(c_int, const c_char)> GenericDetour<fn(mut client_t, const c_char)> GenericDetour<fn(const c_char, qboolean)> RawDetour RawDetour

For the functions inside the VM, that is re-created from time to, I hook into the following functions, deactivating the hooks when the VM gets shut down, and recreating them whenever a new V is set: GenericDetour<extern "C" fn(mut gentity_t)> GenericDetour<extern "C" fn(c_int, qboolean, qboolean) -> const c_char> GenericDetour<extern "C" fn(mut gentity_t)> GenericDetour< extern "C" fn( mut gentity_t, mut gentity_t, mut gentity_t, mut vec3_t, mut vec3_t, c_int, c_int, c_int, )>

The VM detours I cannot drop immediately after the VM gots shut down, as I figured, so that I deactive the detour, and store them for some time in a Vec respectively. After some time I drop the the old, deactivated detours. That works fine with the detours into the ClientSpawn VM function (the first mut gentity_t function), the ClientConnect V function (c_int, qboolean, qboolean), and the G_Damage function (the last on with the 8 parameters), but not with the G_StartKamikaze detour (the third one in the list of VM function taking a mut gentity_t as parameter). So far, I tried to keep 2 past detours around, and if another gets added, drop it, mixed the order of dropping old detours, kepping just one around, and so. With anything I tried, after the third drop of the G_StartKamikaze detour, reproducible, slice_pool crate panics for a poisoned RwLock.

As a workaround, I extended the Vec for G_StartKamikaze up to 65536 elements, and basically keep them around for as long as the program runs, but that sort of is more a work-around to avoid a reproducible crash. My first hunch was that the similar function signatures between ClientSpawn and G_StartKamikaze might be causing troubles, but I don't know for sure. Putting the function signature into its own type did not solve the problem, though.

What else could I try?

Hpmason commented 11 months ago

I'm having trouble finding RwLock inside either this crate or slice-pool (by searching through $HOME/.cargo/registry/index.crates.io/slice-pool-0.4.1/*) so it should be a Mutex.

Do you have a panic message that may point to where the Mutex is being poisoned (or where it's being caught)?

mgaertne commented 11 months ago

The resulting panic points to this line: https://github.com/darfink/slice-pool-rs/blob/f30a539a69fb8f3f7b18e55c9a3f15ed6642b78d/src/sync/mod.rs#L64

Hpmason commented 11 months ago

Can you get the backtrace (set env var RUST_BACKTRACE=full)?

mgaertne commented 11 months ago

Sure, just noticed I got the line number wrong (but not totally). Here's and unstripped stacktrace:

[shinqlx] Detour dropped sucessfully
[shinqlx] Trying to drop pending G_StartKamikaze detour
thread '<unnamed>' panicked at 'releasing chunk: 2', /home/steam/.cargo/registry/src/index.crates.io-6f17d22bba15001f/slice-pool-0.4.1/src/sync/mod.rs:68:8
stack backtrace:
   0:     0x7fbcaa7bc811 - std::backtrace_rs::backtrace::libunwind::trace::hd8505f8952b1d578
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x7fbcaa7bc811 - std::backtrace_rs::backtrace::trace_unsynchronized::he70ad17ef2414299
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x7fbcaa7bc811 - std::sys_common::backtrace::_print_fmt::h81c1ab30de355b1d
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/sys_common/backtrace.rs:65:5
   3:     0x7fbcaa7bc811 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hb173052fee9a48be
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x7fbcaa7dd7ff - core::fmt::rt::Argument::fmt::h24bdb8fc1cdd6ab9
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/fmt/rt.rs:138:9
   5:     0x7fbcaa7dd7ff - core::fmt::write::hfb27016af2bf9db2
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/fmt/mod.rs:1094:21
   6:     0x7fbcaa7b9df7 - std::io::Write::write_fmt::hdbf39a1cb4978ec7
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/io/mod.rs:1714:15
   7:     0x7fbcaa7bc625 - std::sys_common::backtrace::_print::h2b230e23e6fe9050
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x7fbcaa7bc625 - std::sys_common::backtrace::print::hdb8675132c4d8f36
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x7fbcaa7bd8a3 - std::panicking::default_hook::{{closure}}::h0083d1f8688099c0
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/panicking.rs:269:22
  10:     0x7fbcaa7bd634 - std::panicking::default_hook::he0b85b7a539a4b82
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/panicking.rs:288:9
  11:     0x7fbcaa7bde29 - std::panicking::rust_panic_with_hook::hade3b16f8984a4e1
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/panicking.rs:705:13
  12:     0x7fbcaa7bdd27 - std::panicking::begin_panic_handler::{{closure}}::hfe9b4aa9f3378108
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/panicking.rs:597:13
  13:     0x7fbcaa7bcc76 - std::sys_common::backtrace::__rust_end_short_backtrace::h853639dc488abc88
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/sys_common/backtrace.rs:151:18
  14:     0x7fbcaa7bda72 - rust_begin_unwind
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/std/src/panicking.rs:593:5
  15:     0x7fbcaa6292e3 - core::panicking::panic_fmt::h35d15b5e11fdd4fa
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/panicking.rs:67:14
  16:     0x7fbcaa629813 - core::result::unwrap_failed::ha412526979475a54
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/result.rs:1651:5
  17:     0x7fbcaa6e330a - core::result::Result<T,E>::expect::h351919382e61ec38
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/result.rs:1033:23
  18:     0x7fbcaa6e4f62 - slice_pool::sync::ChunkChain::release::he4abc15d63ccb822
                               at /home/steam/.cargo/registry/src/index.crates.io-6f17d22bba15001f/slice-pool-0.4.1/src/sync/mod.rs:66:17
  19:     0x7fbcaa6cdebd - <slice_pool::sync::owned::SliceBox<T> as core::ops::drop::Drop>::drop::h0d21e81ccaf83327
                               at /home/steam/.cargo/registry/src/index.crates.io-6f17d22bba15001f/slice-pool-0.4.1/src/sync/owned.rs:96:5
  20:     0x7fbcaa6d1b47 - core::ptr::drop_in_place<slice_pool::sync::owned::SliceBox<u8>>::he6fad85136e22391
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/ptr/mod.rs:497:1
  21:     0x7fbcaa6d197c - core::ptr::drop_in_place<retour::alloc::ExecutableMemory>::ha3159e321cffece1
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/ptr/mod.rs:497:1
  22:     0x7fbcaa699d1f - core::ptr::drop_in_place<retour::arch::detour::Detour>::h86c32adf92519b27
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/ptr/mod.rs:497:1
  23:     0x7fbcaa698c4a - core::ptr::drop_in_place<retour::detours::generic::GenericDetour<extern "C" fn(*mut shinqlx::quake_types::gentity_s)>>::hbf895e3528cc1b86
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/ptr/mod.rs:497:1
  24:     0x7fbcaa6bc066 - core::mem::drop::hdb5134514c8e42ac
                               at /rustc/993deaa0bf8bab9dd3eadfd1fbeb093328e95afe/library/core/src/mem/mod.rs:987:24
  25:     0x7fbcaa63f593 - shinqlx::quake_live_engine::QuakeLiveEngine::clean_up_pending_detours::h9a5cfd438f5cf191
                               at /home/steam/temp/shinqlx/src/quake_live_engine.rs:1312:17
  26:     0x7fbcaa63d93a - shinqlx::quake_live_engine::QuakeLiveEngine::initialize_vm::hcb4b840df1c2c3c8
                               at /home/steam/temp/shinqlx/src/quake_live_engine.rs:1222:9
  27:     0x7fbcaa694c9c - shinqlx::hooks::shinqlx_sys_setmoduleoffset::h6ff1f5f7455aac7e
                               at /home/steam/temp/shinqlx/src/hooks.rs:51:23
  28:           0x450478 - <unknown>
  29:           0x4459d0 - <unknown>
  30:           0x43c2f7 - <unknown>
  31:           0x43cf0b - <unknown>
  32:     0x7fbcaa662343 - retour::traits::<impl retour::detours::generic::GenericDetour<fn(A,B) .> Ret>>::call::hae67a9060e40942d
                               at /home/steam/.cargo/registry/src/index.crates.io-6f17d22bba15001f/retour-0.3.0/src/macros.rs:241:11
  33:     0x7fbcaa6438e1 - <shinqlx::quake_live_engine::QuakeLiveEngine as shinqlx::quake_live_engine::SpawnServer>::spawn_server::h6cbedaeeb1b5a7e6
                               at /home/steam/temp/shinqlx/src/quake_live_engine.rs:2058:9
  34:     0x7fbcaa696dcb - shinqlx::hooks::shinqlx_sv_spawnserver::ha35c879123737e80
                               at /home/steam/temp/shinqlx/src/hooks.rs:309:5
  35:           0x437f60 - <unknown>
  36:           0x420ce6 - <unknown>
  37:           0x425331 - <unknown>
  38:           0x450bb9 - <unknown>
  39:           0x4078ee - <unknown>
  40:     0x7fbcaa04618a - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  41:     0x7fbcaa046245 - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:381:3
  42:           0x4079f9 - <unknown>
fatal runtime error: failed to initiate panic, error 5

(happens at the third drop of the G_StartKamikaze drop)

Hpmason commented 11 months ago

Seem like the issue is related to the offset of the chunk being incorrect, resulting in a panic because the chunk not being found.

    let index = chunks
      .binary_search_by_key(&offset, |chunk| chunk.offset)
      .expect("releasing chunk");

It looks like there's an issue that may be related to this: https://github.com/darfink/slice-pool-rs/pull/1. Can you try using this branch in your project? Just add this to your Cargo.toml:

[patch.crates-io]
slice-pool = {git = "https://github.com/Gh0u1L5/slice-pool-rs", rev = "12460e7"}
mgaertne commented 11 months ago

Tried the patch. That seems to have solved the problem.

Hpmason commented 11 months ago

Cool, I'm glad that it works! Now, I just to figure out the best way to fix this from the retour crate since we can't publish to crates.io with a patch.

mgaertne commented 11 months ago

slice-pool hasn't been updated in years. Maybe creating a slice-pool2 version, publish it, and put the dependency into retour could be a solution. :)

Hpmason commented 11 months ago

I just pushed v0.3.1 to crates.io, so you should be able to use that now.