stevedonovan / luar

luar is a Go package for conveniently working with the luago Lua bindings. Arbitrary Go functions can be registered
MIT License

pairs/ipairs proxies crash under non-main coroutine #46

Open glycerine opened 6 years ago

glycerine commented 6 years ago

minimized reproducer, under LuaJIT 2.1.0-beta3, on darwin OSX, amd64, go1.9.1

raw luajit gi> coroutine.resume(coroutine.create(function() for i,k in pairs({hello="world"}) do end; end))
error from Lua vm.Call(0,0): '[string "coroutine.resume(coroutine.create(function() ..."]:1: bad argument #1 to 'resume' (table expected, got thread)'. supplied lua with: 'coroutine.resume(coroutine.create(function() for i,k in pairs({hello="world"}) do end; end)'
lua stack:

========== begin DumpLuaStack: top = 1
DumpLuaStack: i=1, t= 4
 String :   [string "coroutine.resume(coroutine.create(function() ..."]:1: bad argument #1 to 'resume' (table expected, got thread)
========= end of DumpLuaStack

raw luajit gi>

if I turn off the pairs proxy, no problem:

raw luajit gi> coroutine.resume(coroutine.create(function() for i,k in pairs({hello="world"}) do end; end))

elapsed: '35.82µs'
raw luajit gi>

similarly, with the ipairs proxy in place, it crashes:

raw luajit gi> coroutine.resume(coroutine.create(function() for i,k in ipairs({[1]="world"}) do end; end))
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x104ce5828 pc=0x44b8650]

runtime stack:
runtime.throw(0x4624309, 0x2a)
    /usr/local/go/src/runtime/panic.go:605 +0x95
runtime.sigpanic()
    /usr/local/go/src/runtime/signal_unix.go:351 +0x2b8

goroutine 1 [syscall, locked to thread]:
runtime.cgocall(0x44b5a00, 0xc42005bb20, 0x42a0100)
    /usr/local/go/src/runtime/cgocall.go:132 +0xe4 fp=0xc42005bad8 sp=0xc42005ba98 pc=0x4004414
github.com/gijit/gi/vendor/github.com/glycerine/golua/lua._Cfunc_lua_pcall(0x4c80378, 0x0, 0x1, 0x0)
    github.com/gijit/gi/vendor/github.com/glycerine/golua/lua/_obj/_cgo_gotypes.go:1191 +0x4d fp=0xc42005bb20 sp=0xc42005bad8 pc=0x42a585d
github.com/gijit/gi/vendor/github.com/glycerine/golua/lua.(*State).pcall.func1(0x4c80378, 0x0, 0x1, 0x1)
    /Users/jaten/go/src/github.com/gijit/gi/vendor/github.com/glycerine/golua/lua/lua.go:176 +0x78 fp=0xc42005bb58 sp=0xc42005bb20 pc=0x42af978
github.com/gijit/gi/vendor/github.com/glycerine/golua/lua.(*State).pcall(0xc420067800, 0x0, 0x0, 0x1, 0x44b4c10)
    /Users/jaten/go/src/github.com/gijit/gi/vendor/github.com/glycerine/golua/lua/lua.go:176 +0x49 fp=0xc42005bb88 sp=0xc42005bb58 pc=0x42aae19
...

without the ipairs proxy, no crash:

raw luajit gi> coroutine.resume(coroutine.create(function() for i,k in ipairs({[1]="world"}) do end; end))

elapsed: '30.906µs'
raw luajit gi>

glycerine commented 6 years ago

Dumping the Lua stack at the top of ProxyPairs before a crash:

top of ProxyPairs, here is stack:
========== begin DumpLuaStack: top = 1
DumpLuaStack: i=1, t= 8
 Type(code 8) : no auto-print available.  This is a thread/coroutine.
========= end of DumpLuaStack

So there's no table to be iterated over on the stack, just a coroutine.
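For readers decoding the `t=` values in these dumps: they are the type tags from lua.h in Lua 5.1, which LuaJIT shares. The sketch below is only a lookup table; `typeName` is a hypothetical helper written for this note and is not part of golua or luar.

```go
package main

import "fmt"

// Names for the Lua 5.1 type tags 0..8, in the order lua.h defines them
// (LUA_TNIL = 0 ... LUA_TTHREAD = 8). Light userdata is spelled out here for
// clarity, although lua_typename reports it simply as "userdata".
var typeNames = [...]string{
	"nil", "boolean", "lightuserdata", "number", "string",
	"table", "function", "userdata", "thread",
}

// typeName is a hypothetical helper: it maps a tag printed as "t=" in a
// stack dump to a readable name.
func typeName(t int) string {
	if t < 0 || t >= len(typeNames) {
		return "unknown"
	}
	return typeNames[t]
}

func main() {
	// The three tags that appear in the dumps in this thread.
	for _, t := range []int{4, 7, 8} {
		fmt.Printf("t=%d is %s\n", t, typeName(t))
	}
}
```

So `t= 4` is a string (the error message), `t= 7` is a userdata, and `t= 8` is a thread, which is what ProxyPairs found where it expected a table (tag 5).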

glycerine commented 6 years ago

So it looks like the thread sitting on the original Lua stack is the coroutine that ends up with the objects I want to call (Luar objects I call when I'm on a coroutine).

For instance, upon importing "fmt" with Luar and trying to call Printf, if I look at the stack of the main State (i.e. where the main, original Luar coroutine is), I see only that Lua coroutine sitting there:

'========== begin DumpLuaStack: top = 1
DumpLuaStack: i=1, t= 8
 Type(code 8/ LUA_TTHREAD) : no auto-print available.
========= end of DumpLuaStack
'

If I print the stack of that thread, I see my call Printf setup:

 ... evalThread stack is:
'========== begin DumpLuaStack: top = 2
DumpLuaStack: i=2, t= 4
 String :   Printf
DumpLuaStack: i=1, t= 7
 Type(code 7/ LUA_TUSERDATA) : no auto-print available.
========= end of DumpLuaStack
'

To see this, I implemented ToThread in golua, which was missing before.

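// ToThread wraps lua_tothread: it returns the value at index as a *State,
// or nil when that value is not a thread (lua_tothread returns NULL then).
func (L *State) ToThread(index int) *State {
    cs := (*C.lua_State)(unsafe.Pointer(C.lua_tothread(L.s, C.int(index))))
    if cs == nil {
        return nil
    }
    s := &State{s: cs}
    registerGoState(s)
    return s
}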

So the mystery remains: why isn't my Luar call being done on the coroutine that has it set up?
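One hypothesis that fits both dumps is that the Go callback inspects the main State's stack rather than the stack of the coroutine that made the call. The toy model below is pure Go and does not use golua or luar; `value`, `thread`, and `pairsProxy` are hypothetical names, and the model shows only the symptom, not a confirmed cause.

```go
package main

import "fmt"

// value stands in for a Lua value; only its type name matters here.
type value struct{ kind string }

// thread models one Lua coroutine, which is essentially its own value stack.
type thread struct {
	name  string
	stack []value
}

// pairsProxy models a Go callback that expects a table as argument 1 on
// whichever stack it is handed.
func pairsProxy(t *thread) error {
	got := "no value"
	if len(t.stack) > 0 {
		got = t.stack[0].kind
	}
	if got != "table" {
		return fmt.Errorf("bad argument #1 (table expected, got %s)", got)
	}
	return nil
}

func main() {
	// The calling coroutine holds the real argument.
	co := &thread{name: "coroutine", stack: []value{{kind: "table"}}}
	// The main state holds only the coroutine object, as in the dump above.
	mainState := &thread{name: "main", stack: []value{{kind: "thread"}}}

	fmt.Println("checked on calling coroutine:", pairsProxy(co))
	fmt.Println("checked on main state:", pairsProxy(mainState))
}
```

In this model the same callback succeeds or fails depending only on which stack it reads, which is why the identity of the State passed to the proxy matters.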

glycerine commented 6 years ago

Thinking further, so the Go native callbacks are registered with the main luar State. I'll try registering them with the new coroutine as well...

glycerine commented 6 years ago

hmm... though notice: the stack has "Printf" on it (above), but that is only a string; it hasn't been GetField()-ed into a function yet (and I'm assuming that the userdata is indeed a proxy for the "fmt" map, but that's just a guess).

Ambrevar commented 6 years ago

I don't have much time to work on Luar at the moment, but if you can work on a PR, I'd review it.

Also see https://github.com/aarzilli/golua: ON THREADS AND COROUTINES.

glycerine commented 6 years ago

@Ambrevar

I don't have much time to work on Luar at the moment, but if you can work on a PR, I'd review it.

Thanks, that would be great. I'll give it a go.

Also see https://github.com/aarzilli/golua: ON THREADS AND COROUTINES.

Ah, I'd missed that. Thanks for pointing it out. So it's basically uncharted/"There be Dragons"/not-really-tested/might-never-have-been-implemented-or-working-in-the-first-place, and this is good to know.

The two open questions in my mind, that would be helpful to have input on:

a) is a coroutine really equivalent to a whole golua.State? I kind of assumed, perhaps naively, that a golua.State would hold many coroutines, since they all refer to the same _G global env table, and can refer to the same upvals/closures/nested scopes, etc.

b) where should the proxies live? If they are registered in _G on the main golua.State, I kind of assumed they would be reachable from any coroutine. Similar confusion as in (a), really. But if proxies need to be available to all new coroutines (they do), and a newly created coroutine won't necessarily call back into Go before accessing proxies, this implies that all registrations in _G need to be transferred to the new _G of the new golua.State... which hardly makes sense. Anyway, I'm just trying to nail down the conceptual model of _G and coroutines at the API level, and there's scarce info out there on this. Any input is helpful.

Ambrevar commented 6 years ago

I'm afraid I can't give you a proper answer immediately, so you might want to ask on the golua repository first. At first glance this is where the issue really lies.

stevedonovan commented 6 years ago

On Tue, Feb 27, 2018 at 4:07 PM, Jason E. Aten, Ph.D. < notifications@github.com> wrote:

a) is a coroutine really equivalent to a whole golua.State? I kind of assumed, perhaps naively, that a golua.State would hold many coroutines, since they all refer to the same _G global env table, and can refer to the same upvals/closures/nested scopes, etc.

Naively we would think, but goroutines do their own kind of magic - may be backed by an OS thread, may not be. Undefined Lua behaviour results!

I think the only safe thing to do is separate State per goroutine. I once tried to get channels working but hit weirdness.

glycerine commented 6 years ago

Naively we would think, but goroutines do their own kind of magic - may be backed by an OS thread, may not be. Undefined Lua behaviour results!

I think the only safe thing to do is separate State per goroutine. I once tried to get channels working but hit weirdness.

Wait, I think we're mixing up Lua coroutines and Go goroutines. I'm talking only about Lua coroutines. I have no problem only ever accessing the Lua state from a single Go goroutine.
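For reference, the "single goroutine" discipline mentioned here can be sketched in plain Go. The `vm` type below is a hypothetical stand-in for a Lua state (it is not golua's State), and `startOwner` and `run` are illustrative names; the point is only that one goroutine, locked to its OS thread, owns the interpreter and everything else talks to it over a channel.

```go
package main

import (
	"fmt"
	"runtime"
)

// vm is a hypothetical stand-in for a Lua state; assume it is not safe for
// concurrent use.
type vm struct{ calls int }

func (v *vm) eval(src string) string {
	v.calls++
	return fmt.Sprintf("eval #%d: %s", v.calls, src)
}

// request carries one chunk of source and a channel for its result.
type request struct {
	src   string
	reply chan string
}

// startOwner launches the only goroutine that ever touches the vm.
func startOwner() chan<- request {
	reqs := make(chan request)
	go func() {
		// Pin this goroutine to one OS thread for the interpreter's lifetime.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		v := &vm{}
		for r := range reqs {
			r.reply <- v.eval(r.src)
		}
	}()
	return reqs
}

// run sends src to the owning goroutine and waits for the result.
func run(reqs chan<- request, src string) string {
	reply := make(chan string, 1)
	reqs <- request{src: src, reply: reply}
	return <-reply
}

func main() {
	reqs := startOwner()
	fmt.Println(run(reqs, "print(1)"))
	fmt.Println(run(reqs, "print(2)"))
	close(reqs)
}
```

Lua coroutines created inside such a state all run on that one goroutine, so this arrangement says nothing about the coroutine problem itself; it only rules out goroutine scheduling as a factor.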

stevedonovan commented 6 years ago

On Wed, Feb 28, 2018 at 10:50 AM, Jason E. Aten, Ph.D. < notifications@github.com> wrote:

Wait, I think we're mixing up Lua coroutines and Go goroutines. I'm talking only about Lua coroutines. I have no problem only ever accessing the Lua state from a single Go goroutine.

Coroutines should be fine. Otherwise it's definitely a puzzle.

glycerine commented 6 years ago

Following Obi Wan Kenobi's dictum, "Use the source, Luke", I have discovered the following.

There is global state, shared by all threads:

https://github.com/glycerine/lua-5.1.5/blob/master/src/lstate.h#L66

And then there is per-thread state:

https://github.com/glycerine/lua-5.1.5/blob/master/src/lstate.h#L98

But what we are typically passing around is the second. This was my misunderstanding. I thought we were passing around the global state, but actually we typically pass around the main coroutine (Lua thread), which is a per-thread state, in calls like vm.Pop(1). I guess this makes sense, since a thread is essentially a stack, and we're operating on the stack with most of the API.

Each per-thread state has a pointer to the global state. It's not typical to pass around the global state (_G) directly. cf.

https://github.com/glycerine/lua-5.1.5/blob/master/src/lstate.h#L105

https://github.com/glycerine/lua-5.1.5/blob/master/src/lstate.h#L130

https://github.com/glycerine/lua-5.1.5/blob/master/src/lgc.c#L129
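The split between global and per-thread state can be pictured with a small pure-Go model. This is a sketch, not the Lua implementation: `globalState` and `luaThread` are hypothetical types loosely named after global_State and lua_State in lstate.h, and the `registry` map is a stand-in for whatever is shared. It relies on two documented properties of lua_newthread in Lua 5.1: the new thread is pushed onto the creating thread's stack, and it shares global objects with the original state while getting an independent stack.

```go
package main

import "fmt"

// globalState models the part shared by every coroutine.
type globalState struct {
	registry map[string]string
}

// luaThread models a per-thread state: its own stack plus a pointer to the
// shared global state.
type luaThread struct {
	g     *globalState
	stack []string
}

// newMain models creating the main thread together with its global state.
func newMain() *luaThread {
	return &luaThread{g: &globalState{registry: map[string]string{}}}
}

// push appends a value to this thread's own stack only.
func (t *luaThread) push(v string) { t.stack = append(t.stack, v) }

// newThread models lua_newthread: the child shares g, starts with an empty
// stack, and is itself pushed onto the parent's stack.
func (t *luaThread) newThread() *luaThread {
	child := &luaThread{g: t.g}
	t.push("thread")
	return child
}

func main() {
	m := newMain()
	m.g.registry["pairs"] = "proxy" // registered once, via the main thread

	co := m.newThread()
	co.push("table") // an argument pushed on the coroutine's stack

	fmt.Println("main stack:", m.stack)
	fmt.Println("coroutine stack:", co.stack)
	fmt.Println("proxy visible from coroutine:", co.g.registry["pairs"])
	fmt.Println("same global state:", co.g == m.g)
}
```

Under this model, anything registered through the main thread is reachable from every coroutine, which speaks to question (b) above, while stack operations such as Pop affect only the thread they are called on.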