lgi-devs / lgi

Dynamic Lua binding to GObject libraries using GObject-Introspection
MIT License
434 stars 68 forks source link

C calling into Lua directly #92

Open gregier opened 9 years ago

gregier commented 9 years ago

I'm using g_object_new() directly to create a Lua object, however this only works from the thread the Lua state was created on. Is there a way to make calls into Lua from a different thread?

The code causing the issues: https://git.gnome.org/browse/libpeas/tree/loaders/lua5.1/peas-plugin-loader-lua.c#n212

pavouk commented 9 years ago

The issues you are seeing look like random Lua state internals corruptions, right? If my guess is correct, it might be because of Lua state is single-threaded only but Lua does not enforce this in any way (i.e. there is no built-in giant lock as e.g. in Python).

lgi contains internal state lock, so calling into Lua from C code through signals or virtual implemented in Lua should be safe. However, it seems that your issue might be because you are using Lua state 'manually' while some other code use it though lgi-invoked callbacks.

It would be possible to export lgi's lock to be used from C code, something like this would be needed on your side:

After Lua state is created, lg. loaded but no access from other threads is done:

void (*enter_state)(void), (*leave_state)(void);
if (!peas_lua_utils_require(L, "lgi"))
    return FALSE;
lua_getfield(L, -1, "enter");
enter_state = lua_touserdata(L, -1);
lua_getfield(L, -2, "leave");
leave_state = lua_touserdata(L, -1);
lua_pop(L, 3);

And then every lua_State reference which might occur concurrently from another thread would have to be wrapped with this:

enter_state();

// Perform something with L, e.g. peas_plugin_loader_lua_create_extension()

leave_state();

Of course, lgi.enter and lgi.leave do not exist yet, but exposing them would be really easy. However, I'm not sure that I understood the issue and your use case right, so it would be nice to hear from you before adding this code.

gregier commented 9 years ago

I'm not getting corruption because I was using a GRecMutex to protect the created lua_State. Instead I was getting a deadlock because g_object_new() was calling into lgi but the state lock was still locked as lgi hadn't released it yet.

Adding the enter and leave functions fixes the issue, however I still require a lock in the loader. Otherwise two threads would be working on the lua_State at the same time. This ends up slowing down the tests by quite a bit, hell Python ended up being over twice as fast and that has the GIL!

I was looking at using the package lock (lgi.core.registerlock), however that is global. While this solution works, is there another idea that would avoid the double locking?

gregier commented 9 years ago

The commit for adding the lock functions: https://github.com/gregier/lgi/commit/103118dc82de72905ee2152360afe675ca0c72be

If this method is prefered I can create a pull request.

pavouk commented 9 years ago

I think that the terrible overhead is not only because of double-locking, but because of overhead of calling enter()/leave() as Lua functions. It might be possible to avoid that by exporting lgi_state_get_lock() result as lgi.lock (lightuserdata) and exporting addresses of lgi_state_enter/leave as lgi.enter/leave (lightuserdata too)

In your code, when you instantiate lua_State and load lgi, you retrieve and remember these 3 values (as C pointers) side-by-side with the lua_State. You can then use this locking construct (enter(lock) and leave(lock)) instead of your own lua_State protection by GRecMutex, because internal implementation of enter/leave do not use lua_State at all.

Hopefully, this should lower the overhead back to sane values.

I'll try to cook this up today evening.

pavouk commented 9 years ago

Pushed. Please let me know whether it helped - no hurry though :-)

gregier commented 9 years ago

Sorry for the delay, I just tested the new API and it works perfectly and quickly!

Thanks!

gregier commented 9 years ago

I've actually started to run into some issues with this. When multiple threads are running and the LGI lock is released while calling g_type_is_a() the stack is corrupted and the incorrect object is returned to the plugin loader. Any idea how I can get this working?

https://bugzilla.gnome.org/show_bug.cgi?id=741859