luvit / luv

Bare libuv bindings for lua
Apache License 2.0
818 stars 185 forks source link

Sharing the event loop leads to issues with processing foreign handles/cleanup #721

Open rdw-software opened 1 week ago

rdw-software commented 1 week ago

Following-up with https://github.com/luvit/luv/issues/718#issuecomment-2407216958):

The SEGFAULTs mentioned here are related to a specific use case, which seemingly violates a core assumption in the luv code.

Handles ( uv_handle_t) managed by luv are allocated in luv_setup_handle, where they store the luv_handle_t as userdata:

https://github.com/luvit/luv/blob/5eb6543d9df0e9965b1f559c528473dcb4ce5003/src/lhandle.c#L31

https://github.com/luvit/luv/blob/5eb6543d9df0e9965b1f559c528473dcb4ce5003/src/lhandle.c#L50-L55

This works great when all handles are created in Lua and luv is the only user of the event loop. But when an application that embeds luv wants to integrate other libraries (that also support libuv) and then shares the uv_loop_t with luv, things can get a little dicey:

The application might want to use libuv APIs directly (from C and/or the FFI), so a high degree of interoperability would be valuable. Beause luv_loop() is a public API I assumed that making use of the existing event loop was fine, even in the same Lua state. Some popular libraries that support sharing event loops would be libwebsockets and uSockets (used in uWebSockets). The latter is what I use.

I encountered the problem because the library creates several handles that trigger the assertion/crash when processed by luv:

https://github.com/luvit/luv/blob/5eb6543d9df0e9965b1f559c528473dcb4ce5003/src/loop.c#L92-L98

Here the foreign handles were being processed as if their data pointed to a luv_handle_t reference, which it did not. I've since experimented with workarounds that allow skipping the irrelevant handles, but the lifetime/GC logic still choked on them.

I'm not sure if this is a supported use case or if you'd rather see embedders adopt a different approach. Potential solutions:

There could be better ways of handling this. Or maybe the library already supports sharing the loop, but I didn't see how. Any ideas?

zhaozg commented 1 week ago

https://github.com/luvit/luv/blob/5eb6543d9df0e9965b1f559c528473dcb4ce5003/src/luv.h#L118-L122

We provide luv_set_loop for that, the neovim show howto embeds luv, with a external loop. https://github.com/neovim/neovim/blob/c4762b309714897615607f135aab9d7bcc763c4f/src/nvim/lua/executor.c#L600

more info please look at https://github.com/luvit/luv/pull/331 https://github.com/luvit/luv/pull/345

rdw-software commented 1 week ago

Thanks! But the handles aren't treated any differently, nor does luv_set_callback help in this case. As soon as anything not originating with luv is put on the loop, processing it leads to crashes. Regardless of who owns/manages the uv_loop_t:

  1. Someone creates the loop itself (doesn't matter where this is done)
  2. Loop is assigned to luv and will be used to run all of its handles (as expected)
  3. Loop is assigned to uSockets, or some other third-party code for that matter
  4. This code puts additional handles onto the loop, which have no corresponding luv_handle_t
  5. uv.walk() invokes the luv_walk_cb for all handles, including the ones luv didn't create
  6. Crash or assertion failure ensues because the handles aren't recognized as valid by luv
  7. Bonus: They also cause problems when luv later tries to close them (GC/shutdown sequence)

I tried this (somewhat simplified) approach:

uv_loop_t shared_loop;
uv_loop_init(&shared_loop);
luv_set_loop(L, &shared_loop); // Stores the loop in the context (OK)
// Calling luaopen_luv afterwards does make it use the shared loop (OK)

// Assign loop to another library, create requests independently, or what have you
// uv.walk() from Lua, uv_walk from C, or do any other processing of the handles
// --> Will likely crash here unless luv ignores/is made aware of the foreign handles

uv_run(&shared_loop, UV_RUN_DEFAULT); // Doesn't matter where this happens, or if it happens at all (?)

NeoVIM seems to create its own main loop on top of uv_loop_t. I don't know that this approach would be desirable here.

Bilal2453 commented 1 week ago

NeoVIM seems to create its own main loop on top of uv_loop_t. I don't know that this approach would be desirable here.

I haven't looked at anything else yet, but yes indeed that is what Zhaozg is suggesting, drive luv's event loop using the other event loop, I don't believe integrating another event loop into Luv's, or letting it manage handles created by something else is supported nor am I sure if it is a goal. If I remember correctly even libuv has issues handling that last time I looked at integrating cqueues.

Luv still needs to know how to GC a handle (or if not to gc it), and will require other references while operating on a handle, I don't think it is really possible to change how all of this works as of now.

zhaozg commented 1 week ago

Crash or assertion failure ensues because the handles aren't recognized as valid by luv

Please know, use luv to process handler created by luv, if handler create by others lib, that will damage the luv_handle_t object(uv_handle_t->data) wrap mechanism. https://github.com/luvit/luv/blob/master/src/lhandle.c#L19.

rdw-software commented 5 days ago

There's only one event loop (uv_loop_t), which is used by luv and potentially other code. That's the goal, at least.

I've looked through the luv sources some more and with luv_set_loop the GC problem is already solved. Indeed, the only function that should ever encounter the foreign handles is uv.walk() because it processes all the handles that exist on the shared loop (but crashes if any were created externally). That means there's no need for sweeping changes.

I've made it work by replacing uv.walk with a wrapper that "tags" the foreign handles (handle->data = NULL), calls uv_walk from luv, and finally restores the original pointer. That's the simplest and least hacky option I could come up with, but it does require one tiny change to luv itself. The assert in loop.c would have to be turned into a conditional return:

static void luv_walk_cb(uv_handle_t* handle, void* arg) {
  lua_State* L = (lua_State*)arg;
  luv_handle_t* data = (luv_handle_t*)handle->data;

  // Sanity check
  // Most invalid values are large and refs are small, 0x1000000 is arbitrary.
  if(!data || data->ref >= 0x1000000) return; // <--- This is the only change

  lua_pushvalue(L, 1);           // Copy the function
  luv_find_handle(L, data);      // Get the userdata
  data->ctx->cb_pcall(L, 1, 0, 0);  // Call the function
}

This effectively skips any handles that weren't created from within luv, and offloads everything else to the application.

A slightly less cryptic alternative would be to introduce a flag (tag value) that makes the intention clearer:

  // Some constant like this might be defined in luv.h - name and ptr value debatable:
  #define LUVF_EXTERNAL_HANDLE  ((void*)0xDEADBEEF) // Or another "magic" value

With this, an early exit could be used to skip invoking the callback for handles that luv doesn't manage itself:

static void luv_walk_cb(uv_handle_t* handle, void* arg) {
  lua_State* L = (lua_State*)arg;
  luv_handle_t* data = (luv_handle_t*)handle->data;

  if(data == LUVF_EXTERNAL_HANDLE) return; // Ignore external loop participants (no safety guarantees)

  // Sanity check
  // Most invalid values are large and refs are small, 0x1000000 is arbitrary.
  assert(data && data->ref < 0x1000000); // <--- Original assertion still exists

  lua_pushvalue(L, 1);           // Copy the function
  luv_find_handle(L, data);      // Get the userdata
  data->ctx->cb_pcall(L, 1, 0, 0);  // Call the function
}

Would you accept a non-disruptive change like that? I can diff-patch luv, but I'd rather PR it and avoid future headaches.

Bilal2453 commented 4 days ago

I feel like it makes sense for uv.walk to not just crash on you if it encounters a foreign handle, and rather just handle what it knows to handle. I think I prefer removing the assertion solution.