ThePhD / sol2

Sol3 (sol2 v3.0) - a C++ <-> Lua API wrapper with advanced features and top notch performance - is here, and it's great! Documentation:
http://sol2.rtfd.io/
MIT License

Forcing objects to be garbage collected #1391

Open judicaelclair opened 2 years ago

judicaelclair commented 2 years ago

I am using sol2 to expose the functionality of plugins (dynamic shared objects) that I can reload at runtime. For instance, I can have plugin A (libplugin_a.so), which contains the following code:

static void foo(unsigned arg) {
  // does some stuff
}

void bind_to_lua_state(sol::state& s) {
  auto t = s.template get_or_create<sol::table>("plugin_a_namespace");
  t["lambda"] = []() { /* do some other stuff */ };
  t["foo_raw_pointer"] = &foo;
  t["foo_with_default_arg"] = sol::overload([]() { foo(1); }, &foo);
  t["foo_as_std_fun"] = std::function<void(unsigned)>(&foo);
}

void unbind_from_lua_state(sol::state& s) {
  s["plugin_a_namespace"] = sol::nil;
}

Then, the typical execution flow is as follows:

1. Load plugin A via dlopen(). This loads the corresponding dynamic shared object, since it was not previously resident.
2. Call bind_to_lua_state().
3. Execute Lua code that invokes plugin A functionality (e.g. plugin_a_namespace.foo_raw_pointer(1)).
4. Call unbind_from_lua_state().
5. Call sol::state_view::collect_garbage().
6. Unload plugin A via dlclose(). This unloads the corresponding dynamic shared object, since its reference count drops to 0.
7. Repeat steps 1-6 indefinitely.

With this flow, I often get an AddressSanitizer (ASAN) error during garbage collection:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==77129==ERROR: AddressSanitizer: SEGV on unknown address 0x7f9fafacd6a0 (pc 0x7f9fafacd6a0 bp 0x7fffb2943af0 sp 0x7fffb2943988 T0)
==77129==The signal is caused by a READ memory access.
==77129==Hint: PC is at a non-executable region. Maybe a wild jump?
    #0 0x7f9fafacd6a0  (/libplugin_a.so+0xd66a0) (BuildId: a62a67b00c12ac78ffd2d28d15c489ed5f094230)
    #1 0x7f9fc0ea540e in luaD_precall /lua/ldo.c:576:7
    #2 0x7f9fc0e4cfea in ccall /lua/ldo.c:616:13
    #3 0x7f9fc0df8cf7 in luaD_callnoyield /lua/ldo.c:636:3
    #4 0x7f9fc0e212e1 in dothecall /lua/lgc.c:895:3
    #5 0x7f9fc0dbccc1 in luaD_rawrunprotected /lua/ldo.c:144:3
    #6 0x7f9fc0dfa932 in luaD_pcall /lua/ldo.c:934:12
    #7 0x7f9fc0e1f7bc in GCTM /lua/lgc.c:915:14
    #8 0x7f9fc0e2ebbf in runafewfinalizers /lua/lgc.c:934:5
    #9 0x7f9fc0e2cba8 in singlestep /lua/lgc.c:1626:16
    #10 0x7f9fc0e29bcf in luaC_runtilstate /lua/lgc.c:1648:5
    #11 0x7f9fc0f47f76 in fullinc /lua/lgc.c:1708:3
    #12 0x7f9fc0e01722 in luaC_fullgc /lua/lgc.c:1723:5
    #13 0x7f9fc0dfeabc in lua_gc /lua/lapi.c:1149:7
    #14 0x7f9fc318ce29 in sol::state_view::collect_garbage() /sol2/include/sol/state_view.hpp:668:4

When t["foo_with_default_arg"] or t["foo_as_std_fun"] are freed, their corresponding finalizers are invoked, but they are sometimes freed only after the plugin has been reloaded, which produces the ASAN error above. By contrast, I do not get any problems with t["foo_raw_pointer"] and t["lambda"], presumably because there are no finalizers to invoke.
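
The guess about finalizers can be probed with standard C++ alone. The sketch below does not touch sol2 or Lua; it only checks which of the four bound callables convert implicitly to a plain function pointer. It is an assumption, not something established in this report, that a binding layer can keep such a pointer without any finalizer, whereas std::function and an overload bundle need their own storage and cleanup. The overload_like tuple is a hypothetical stand-in for the result of sol::overload, not its real type.

```cpp
#include <functional>
#include <tuple>
#include <type_traits>

// Function pointer shapes matching foo(unsigned) and the captureless lambda.
using raw_fn     = void (*)(unsigned);
using nullary_fn = void (*)();

// Same shape as the t["lambda"] binding: a lambda with no captures.
auto plain_lambda = []() { /* do some other stuff */ };
using lambda_t  = decltype(plain_lambda);
using std_fun_t = std::function<void(unsigned)>;

// Hypothetical stand-in for an overload set: a bundle of several callables.
using overload_like = std::tuple<lambda_t, raw_fn>;

// True when the callable converts implicitly to a plain function pointer.
constexpr bool raw_pointer_is_plain = std::is_convertible<raw_fn, raw_fn>::value;
constexpr bool lambda_is_plain      = std::is_convertible<lambda_t, nullary_fn>::value;
constexpr bool std_fun_is_plain     = std::is_convertible<std_fun_t, raw_fn>::value;
constexpr bool overload_is_plain    = std::is_convertible<overload_like, raw_fn>::value;

// The two bindings that never crash are the two that decay to a pointer.
static_assert(raw_pointer_is_plain && lambda_is_plain, "plain function pointers");
static_assert(!std_fun_is_plain && !overload_is_plain, "stateful wrappers");
```

If that assumption holds, the two entries that fail this check are the ones whose cleanup code would be compiled into libplugin_a.so, which is consistent with frame #0 of the trace above pointing into that library.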

Is there a way to guarantee that t["foo_with_default_arg"] and t["foo_as_std_fun"] are garbage collected immediately during step 5, so that plugin A can be safely unloaded?

I tried switching the garbage collector to generational mode with change_gc_mode_generational(20, 100), but that didn't help. I also tried setting each object to nil individually, but that didn't help either:

void unbind_from_lua_state(sol::state& s) {
  s["plugin_a_namespace"]["foo_with_default_arg"] = sol::nil;
  s["plugin_a_namespace"]["foo_as_std_fun"] = sol::nil;
  s["plugin_a_namespace"] = sol::nil;
}