kyren / piccolo

An experimental stackless Lua VM implemented in pure Rust
Creative Commons Zero v1.0 Universal

Implement `__concat`, `table.concat`, and add Lua polyfills for `table.sort` and `table.move` #88

Closed Aeledfyr closed 1 month ago

Aeledfyr commented 2 months ago

At some point it would be good to have table.move and table.sort implemented in Rust, but for now a Lua implementation works.

This PR also implements the __concat metamethod, and uses it to implement table.concat; unlike PUC-Rio Lua's implementation, this version supports tables with the __concat metamethod.

For example, table.concat({ a, b, c }, sep) is equivalent to a .. sep .. b .. sep .. c, and respects the right-associativity of the .. operator.
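That dispatch order can be made concrete with a small language-neutral sketch in Python, where an instrumented `+` stands in for Lua's `__concat` metamethod (hypothetical helper names; not piccolo's API):

```python
class Val:
    """Instrumented value whose '+' stands in for Lua's __concat
    metamethod in this sketch (not piccolo's actual API)."""
    log = []

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        Val.log.append((self.name, other.name))
        return Val(self.name + other.name)

def table_concat(items, sep=None):
    # Interleave the separator, then fold from the right, mirroring
    # the right-associativity of `..`: a .. (sep .. (b .. (sep .. c)))
    parts = []
    for i, x in enumerate(items):
        if i > 0 and sep is not None:
            parts.append(sep)
        parts.append(x)
    acc = parts[-1]
    for p in reversed(parts[:-1]):
        acc = p + acc
    return acc

a, b, c = Val("a"), Val("b"), Val("c")
out = table_concat([a, b, c], Val(","))
# The rightmost concatenation runs first, the leftmost last.
```

With plain strings the fold direction is unobservable; it only matters when an element carries a `__concat` metamethod, which is exactly the case this PR adds.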

Semantics questions on table.concat:

kyren commented 1 month ago

This is really good, thank you!

> Should table.concat({ x }) == x, regardless of type? It currently does, as it never tries to call the __concat metaoperator.

I think so; I don't think the __concat metamethod should be called with a single argument, and it would be weirder still to call something like __tostring. It also makes sense because it makes table.concat({ table.concat({ x }), y }) the same as table.concat({ x, y }).
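A minimal Python sketch of that identity (hypothetical `table_concat` helper, strings plus one integer; not piccolo's API):

```python
def table_concat(items, sep=""):
    # Proposed rule from the discussion: a singleton list returns its
    # element unchanged, with no __concat-style dispatch at all.
    if len(items) == 1:
        return items[0]
    result = items[0]
    for item in items[1:]:
        if sep:
            result = result + sep
        result = result + item
    return result

x, y = "x", "y"
# Nesting a singleton concat is a no-op, so these two agree:
left = table_concat([table_concat([x]), y], "-")
right = table_concat([x, y], "-")
# And a non-string single element passes through untouched:
passthrough = table_concat([42])
```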

> Is the complexity of this worth it over only supporting string/integers in table.concat?

This I'm not as sure about, but it makes complete sense to me to use the same implementation of the concat metaop for table.concat.

kyren commented 1 month ago

We're accumulating a lot of stdlib functions that have unbounded time usage. For some, that time is at least proportional to some allocated memory (the number of arguments on the stack, the size of a table, etc.); others don't use fuel proportional to their input size at all.

I'm not fussed about solving this now, especially varargs because I think there might need to be a more centralized solution that limits varargs to something reasonable (I would be fine with a lower limit than PUC-Rio's limit even, which is something silly like a million). I'm mostly pointing this out to LET IT BE KNOWN that I am aware of it, and that it is a temporary problem.

I would say that a good (tentative) plan would be to:

1) Limit varargs to 64k across the board. Pushing / popping stack frames always takes time proportional to the amount of stack manipulation happening, so whatever this limit is is implicitly a lower bound on the maximum amount of time that an Executor::step call can take. Stack manipulation like this is always a memcpy / memmove (thanks to Value being Copy!), so it is incredibly fast, but there should be SOME reasonable universal limit. Function calls / returns already use fuel proportional to the number of arguments; we just need to define a maximum.
2) Go through every stdlib function that deals with tables / strings and make sure that it processes tables in blocks of at most 64k (or lower), and also that it uses fuel proportional to the number of elements processed.
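The block-processing half of this plan can be sketched in Python as follows (hypothetical `consume_fuel` / `process_table` helpers and a fuel dict, not piccolo's API; 64k is the proposed cap):

```python
BLOCK = 64 * 1024  # the proposed universal 64k cap

def consume_fuel(state, amount):
    # Hypothetical fuel accounting: charge in proportion to the work
    # actually performed in this block.
    state["fuel"] -= amount

def process_table(state, items, visit):
    # Walk a table in blocks of at most BLOCK elements, paying fuel
    # per element processed; a real VM would be able to suspend
    # between blocks whenever fuel runs out.
    for start in range(0, len(items), BLOCK):
        block = items[start:start + BLOCK]
        consume_fuel(state, len(block))
        for x in block:
            visit(x)

state = {"fuel": 1_000_000}
seen = []
process_table(state, list(range(10)), seen.append)
```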

Aeledfyr commented 1 month ago

What would be the current best approach to fuel consumption for concat? Consume fuel proportional to the output size at each step? (This runs into the issue of not being able to efficiently break up large allocations, though -- would need to repeatedly consume fuel / yield while doing nothing, and then once enough fuel has been consumed, make the full large allocation.)
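One way to read that idea as a sketch (hypothetical fuel dict and helper name, not piccolo's API): pay for the output in fixed-size chunks, and only perform the single large allocation once the full cost has been charged.

```python
def alloc_with_fuel(state, nbytes, chunk=64 * 1024):
    # Charge fuel for the eventual output in chunk-sized steps; a real
    # VM would suspend here whenever fuel runs out and resume once it
    # is refilled, doing no other work in the meantime.
    remaining = nbytes
    while remaining > 0:
        step = min(chunk, remaining)
        state["fuel"] -= step
        remaining -= step
    # Only now, with the full cost paid, make the one large allocation.
    return bytearray(nbytes)

state = {"fuel": 1_000_000}
buf = alloc_with_fuel(state, 200_000)
```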

kyren commented 1 month ago

I'm sorry it took me so long to merge this, I've had a rough week so I've been a little out of it.

I ended up moving the support code for async sequences used by meta ops into meta_ops.rs and making it private. I know this is the opposite of what we discussed before, but I didn't realize it was something ONLY used by meta_ops.rs, so this made more sense to me. This was the only meaningful feedback I had after sitting on it for a week (and it was a tiny change) so I just went ahead and merged the PR rather than bother you about it.

Thank you for all of your hard work. With this, every metamethod is implemented other than __gc!

Aeledfyr commented 2 weeks ago

Thank you! There's no problem with the wait on the merge (after all, I'm responding with a significantly longer delay; life happens).

On the async support code: I agree, but I think it would still be worth making something like it available in the future (potentially in piccolo_util), since it helps a lot with the verbosity of async sequences. It would need more planning to determine which helpers/API would work best, and at this point I don't have enough usage examples to come up with a good design.