ThePhD / sol2

Sol3 (sol2 v3.0) - a C++ <-> Lua API wrapper with advanced features and top notch performance - is here, and it's great! Documentation:
http://sol2.rtfd.io/
MIT License
4.06k stars 492 forks source link

Segfault from protected_function args push #1579

Open checkraisefold opened 5 months ago

checkraisefold commented 5 months ago

This issue is seemingly very hard to debug, and I'm unable to produce a test case for this (considering this usage of sol2 is pretty out of the ordinary and this is the only instance of this crash I'm able to find) The code causing the crash: https://github.com/jpxs-intl/RosaServer/blob/main/RosaServer/hooks.cpp#L1293 where run is a sol2 protected_function, changeable from the Lua environment. This issue also doesn't persist in a debug build, and only happens in release mode. I've gathered the debug info with GCC -g compiler option.

Call stack:

librosaserver.so!sol::stack::stack_detail::undefined_metatable::undefined_metatable(sol::stack::stack_detail::undefined_method_func umf, const char * k, lua_State * l, sol::stack::stack_detail::undefined_metatable * const this) Line 786    C++
librosaserver.so!sol::stack::unqualified_pusher<sol::detail::as_pointer_tag<Vehicle>, void>::push_keyed<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(Vehicle * obj, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > & k, lua_State * L) Line 214   C++
librosaserver.so!sol::stack::unqualified_pusher<sol::detail::as_pointer_tag<Vehicle>, void>::push<Vehicle*>(Vehicle *&& arg, lua_State * L) Line 225    C++
librosaserver.so!sol::stack::push<sol::detail::as_pointer_tag<Vehicle>, Vehicle*, , void>(lua_State*, Vehicle*&&)(Vehicle *&& arg, lua_State * L) Line 897  C++
librosaserver.so!sol::stack::unqualified_pusher<Vehicle*, void>::push<Vehicle*>(lua_State * L) Line 366 C++
librosaserver.so!sol::stack::push<Vehicle*>(Vehicle *&& t, lua_State * L) Line 878  C++
librosaserver.so!sol::stack::stack_detail::push_reference<Vehicle*, Vehicle*>(Vehicle *&& arg, lua_State * L) Line 936  C++
librosaserver.so!sol::stack::push_reference<Vehicle*>(Vehicle *&& t, lua_State * L) Line 943    C++
librosaserver.so!sol::stack::multi_push_reference<char const (&) [18], Vehicle*>(const char (&)[18] t, lua_State * L) Line 971  C++
librosaserver.so!sol::basic_protected_function<sol::basic_reference<false>, false, sol::basic_reference<false> >::call<, char const (&) [18], Vehicle*>(char const (&) [18], Vehicle*&&) const(const sol::basic_protected_function<sol::basic_reference<false>, false, sol::basic_reference<false> > * const this) Line 228 C++
librosaserver.so!sol::basic_protected_function<sol::basic_reference<false>, false, sol::basic_reference<false> >::operator()<char const (&) [18], Vehicle*>(const sol::basic_protected_function<sol::basic_reference<false>, false, sol::basic_reference<false> > * const this) Line 212    C++
>   librosaserver.so!Hooks::deleteVehicle(int vehicleID) Line 1293  C++

I did some debugging, and while the &Engine::vehicles[vehicleID] reference passed to run is perfectly valid and readable memory, the reference seems to be corrupted somehow after the several std::forward parameter unpacks. By the time stack::push_reference is hit, the Vehicle reference is somehow nulled out. I could just be wrong though, seeing as the segfault is for some reason in.. the constructor for undefined_metatable. See screenshots: devenv_t4ZdRj7Zzt devenv_dikVq5T5Fm

The prior run call in that function doesn't undergo a similar segfault and runs completely fine.

Compiler is GCC, being built with CMake. Using VS2022 as an IDE w/WSL remote development. Version of sol2 is latest develop branch, but is reproducible on the latest master as well. Attempted using SOL_ALL_SAFETIES_ON just to see if it'd do anything considering it worked in a debug build and had no dice.

roman-orekhov commented 2 months ago

There's a big problem with Lua references: they are not thread safe. No lua_lock used when doing luaL_ref and luaL_unref. So unless you protect Lua registry usage with locks yourself, you will get strange crashes while creating/destroying sol references from several threads. You can debug it with looking at the registry with my script. Note that the registry sometimes gets corrupted well before the crash. Also note that using the references (lua_rawgeti) seems thread safe.

checkraisefold commented 2 months ago

There's a big problem with Lua references: they are not thread safe. No lua_lock used when doing luaL_ref and luaL_unref. So unless you protect Lua registry usage with locks yourself, you will get strange crashes while creating/destroying sol references from several threads. You can debug it with looking at the registry with my script. Note that the registry sometimes gets corrupted well before the crash. Also note that using the references (lua_rawgeti) seems thread safe.

The problem is that the 'RosaServer' project this thing is happening in is entirely single threaded except for the thread that handles console input, and it just uses detours on the game dedicated server it's a mod for. So there's no cross-thread refs going on here