kyren / piccolo

An experimental stackless Lua VM implemented in pure Rust
Creative Commons Zero v1.0 Universal

Mapping a Lua function over a Rust Iterator #56

Open jrobsonchase opened 4 months ago

jrobsonchase commented 4 months ago

I suspect I'm running up against the downside of the "stackless" style here.

In Rust, I have a reference to some World that I'm wrapping in a piccolo_util::Frozen and sticking in a global Lua variable to be accessed from Lua. This works great for World methods that return static things. When doing more complex queries on the world though, I end up with an iterator which contains a reference to the World, which thanks to Frozen, is only valid within the call to Frozen::with.

My first approach was to pass a Function from Lua to Rust which takes the iterator's Item, and create an Executor to run it inside the callback. Then I found this blurb in the Executor docs:

Executors are not meant to be used from callbacks at all, and Executors should not be nested. Instead, use the normal mechanisms for callbacks to call Lua code so that it is run on the same executor calling the callback.

I'm guessing these "normal mechanisms" are CallbackReturns like Call and Sequence? Since neither the World reference nor the iterator can escape the Frozen::with callback, it doesn't seem like I'd be able to wrap and return the iterator directly, and it can't be recreated and resumed from an arbitrary point. And even if that were possible, it would be handing out references which themselves couldn't be Function::bind'ed and escape.

It seems like the best approach available to me is to copy/collect the iteration results into a Table and return that back to Lua, but I'd like to avoid that overhead if possible.

Is there anything I've missed? Any other approaches to consider?

kyren commented 4 months ago

This is kind of a long response and I'm probably telling you things you already know. It's partly a response to you, and partly me laying out the situation around the "stackless" design for anyone else reading this. Sorry for the lengthy response!

You're correct about the blurb: the whole system is really not designed to nest Executors, and you can panic the interpreter that way (through cross-thread upvalues).

Also, look at it this way: if you ran Lua code in a nested Executor, the system would lose the main property that piccolo is designed for, namely that (the outermost) Executor::step runs for a bounded amount of time. An out-of-control World query would hang the script for an arbitrary amount of time (depending ofc on how bad you can make a single World query).

Not everybody cares about this ofc, and I get that, but that's the niche that I've decided to fit piccolo into. Besides, not all hope is lost, and I think there are still good and not too difficult strategies for almost all use cases.

In your case, one possibility is the one you already figured out: collect the query into a big Table and call it a day. Honestly, even for cases where users care deeply about sandboxing, this can be a fine approach. You can consume execution fuel proportional to the query size and set sane max query limits, and this can be a reasonable and fast way to pass a large amount of data to Lua, especially given the current overhead of Lua -> Rust callback calls (this will get better over time, but it's not there yet). Since the query is one call, and Lua accessing a Table is a builtin operation, one function call and a malloc to fit 1,000 results in a Table can be faster than 1,000 callback calls, right?
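From the Lua side, the collected-table approach might look something like this (a sketch; `world:query` and its arguments are hypothetical names for illustration, not piccolo API):

```lua
-- Hypothetical: world:query(...) runs the whole query inside one Rust
-- callback and returns the results collected into a plain Lua table.
local results = world:query("Position", "Velocity")

-- Iterating the table afterwards uses only builtin Lua operations,
-- with no further Rust callback calls per element.
for i, entity in ipairs(results) do
    print(i, entity)
end
```

The one Rust callback call happens up front; everything after it is ordinary Lua table access.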

Another approach you may not have considered is to follow the Lua iterator protocol. You return an iterator function and any arbitrary control value, and the iterator function is called to update the control value until the control value becomes nil. This would allow Lua to use generic for loops to iterate over the results. This is very similar to the approach you already stated of storing an iterator, but like you said, you do have to deal with the iterator's lifetimes, and may need to use something other than a plain Rust iterator to store the Lua iterator state. You would probably have run into a similar problem no matter the approach, because passing references to items of that Rust Iterator still has the same lifetime problems, though you could theoretically have gotten around that with the Frozen system from piccolo_util.
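For reference, the iterator protocol described above looks like this in pure Lua (a minimal sketch; in the query case a Rust callback would play the role of `iter`):

```lua
-- A "stateless" iterator: the generic for loop calls
-- iter(results, control) repeatedly, and stops as soon as the first
-- returned value is nil.
local function iter(results, control)
    control = control + 1
    local value = results[control]
    if value ~= nil then
        return control, value
    end
end

-- The protocol: return the iterator function, an invariant state
-- value, and the initial control value.
local function query_results(results)
    return iter, results, 0
end

for i, v in query_results({ "a", "b", "c" }) do
    print(i, v)  -- prints 1 a, 2 b, 3 c
end
```

The control value here is just a table index, but it can be any value the iterator function knows how to advance.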

You can still take your original desired approach of calling Lua functions for each element of the query result, but what you can't do is get around the lifetime limitations of this. Using a nested Executor-like system, if a script put `while true do end` in its query callback, that would block the outer interpreter from ever returning. If the interpreter were somehow able to suspend this nested Executor setup, then it would violate the lifetime of the iterator by returning control to Rust with a live iterator unknown to the Rust borrow checker.

The system you're wanting only works if the only way to exit out of / suspend the inner loop of Lua is by unwinding through the Rust stack frame holding the iterator, and yeah.. that goes against the way piccolo is currently designed. This problem is not unique to piccolo either; the same problem shows up with Rust futures. Because a future may suspend execution, you have to make sure that the data a future refers to can live across an await suspension point, and this is the exact same situation.

I think the best advice I can give right now is to fight as little as possible with Lua about data with non-'static Rust lifetimes. Lua is not designed to deal with non-'static data: you can smuggle references to anything to almost anywhere from almost anywhere; it's just how the language works. This means that most of the time, you want to make the data you pass to Lua 'static and just live with this limitation.

In your case of a World query, you could make the case that returning a collected query as a Lua iterator is a potential abuse vector, since the script can make large queries and then never actually iterate them, holding arbitrary amounts of memory. You can either record this memory through gc-arena and add memory limits, OR do your original approach of calling Lua callback functions for each query result. However, you still need the produced Sequence to hold the query results as a 'static value so that Executor::step can still suspend execution periodically.

This means that concurrently running Rust code could, for example, change the running state of the World during a query, and you need to handle that somehow, but with this approach you can at least more sanely forbid Lua from executing multiple queries at one time.

I know this is all very, very complicated, and what we need is a library of combinators to handle each of these common situations so the user can just choose which tradeoff they want. I do want to push back, though, on the idea that these problems are unique to piccolo or that they make piccolo completely non-viable: these limitations are the same ones that Rust futures have, where you cannot hold arbitrary references across await points (which in piccolo are disappointingly currently expressed as Sequence returns, but I want to change this one day to be actual futures / coroutines once the language has enough features to support it).

piccolo fills a very particular need. Running Lua scripts every tick of a game loop can be a problem because Lua is slow, scripts can be slow, etc. piccolo's niche is to fix this by allowing Lua scripts to run concurrently with the game loop (and with each other) via polling, like Rust futures, but ofc this means they have the same limitations as Rust futures. I still think this is a very good idea, and you almost always run into similar problems anyway because Lua is just not designed to handle non-'static data in the first place, but it's not an idea without tradeoffs.

jrobsonchase commented 4 months ago

Thanks for the reply! I actually dug through discord shortly after posting and found more or less this exact spiel from you to more or less the exact question :grin: Anyway, agreed that it's good to have it here as well for visibility :+1:

In your case, one possibility you already figured out, which is to collect the query into a big Table and call it a day. ... Since the query is one call, and Lua accessing a Table is a builtin operation, one function call and a malloc to fit 1,000 results in a Table can be faster than 1,000 callback calls, right?

Yep, this is precisely what I'm doing now! And that's a very good point regarding the callback vs table access overhead that I hadn't considered, but makes a lot of sense now that you mention it.

Another approach you may not have considered is to follow the Lua iterator protocol. You return an iterator function and any arbitrary control value, and the iterator function is called to update the control value until the control value becomes nil. This would allow Lua to use generic for loops to iterate over the results.

I considered this as well, but as you mentioned, lifetimes are still problematic. Once I switched over to collecting into a Table, I figured "why reinvent the wheel?" and now simply return a CallbackReturn::Call { function: ctx.get_global("ipairs").bind(&ctx, query_result), .. }. I might even skip that and return the table directly and let the caller decide what to do with it :shrug:

One other vague idea I considered trying out would be to yield some sort of "Please create this query" result all the way up to Rust, create and freeze the query, and then resume Lua execution with that. It would invalidate the World reference as long as the frozen query is live though, would probably be fiddly to implement, and I think yielding the main thread isn't possible in the first place? So yeah, "collect into a Table" still seems like the best set of tradeoffs. All of this is probably premature optimization on my part anyway :grinning:

kyren commented 4 months ago

I actually dug through discord shortly after posting and found more or less this exact spiel from you to more or less the exact question.

I am truly sorry that you had to listen to it twice 😆

I think yielding the main thread isn't possible in the first place?

Absolutely it's possible, I didn't even bother to implement the "main thread" distinction that PUC-Rio Lua has with yieldable / un-yieldable threads.

1. Actually yield and treat the main thread as a Lua coroutine in the normal way. The executor will have results you can take, and it will then move to the "Suspended" state where it can be resumed.
2. (This is probably the better option.) You can "yield" in a way invisible to Lua and call Fuel::interrupt(), which transfers control immediately to the Rust that called Executor::step. If you wanted to do a like.. "active query" system where you had to transfer control out of Lua, then execute the query in Rust, then set up something with Frozen to pass non-'static results back into Lua, that'd be what I'd do, I think? That might actually work well?
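From the script's point of view, option 1 could look something like this in pure Lua (a sketch of the "please create this query" idea from earlier in the thread; the `"query"` request shape and its arguments are hypothetical):

```lua
-- The script yields a value describing the query it wants. The Rust
-- side takes the suspended Executor's yield values, runs the query
-- against the World, and resumes the Executor with the collected
-- results, which become the return values of coroutine.yield here.
local results = coroutine.yield("query", "Position", "Velocity")

for _, entity in ipairs(results) do
    -- use entity
end
```

Since the query runs entirely in Rust while the executor is suspended, no non-'static data ever has to be held across a Lua suspension point.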