milochristiansen / lua

A Lua 5.3 VM and compiler written in Go.
zlib License

DCLua - Go Lua Compiler and VM:

This is a Lua 5.3 VM and compiler written in Go. This is intended to allow easy embedding into Go programs, with minimal fuss and bother.

I have been using this VM/compiler as the primary script host in Rubble (a scripted templating system used to generate data files for the game Dwarf Fortress) for over a year now, so they are fairly well tested. In addition to the real-world "testing" that this has received I am slowly adding proper tests based on the official Lua test suite. These tests are far from complete, but are slowly getting more so as time passes.

Most (if not all) of the API functions may cause a panic, but only if things go REALLY wrong. If a function does not state that it can panic or "raise an error" it will only do so if a critical internal assumption proves to be wrong (AKA there is a bug in the code somewhere). These errors will have a special prefix prepended onto the error message stating that this error indicates an internal VM bug. If you ever see such an error I want to know about it ASAP.

That said, if an API function can "raise an error" it can and will panic if something goes wrong. This is not a problem inside a native function (as the VM is prepared for this), but if you need to call these functions outside of code to be run by the VM you may want to use Protect or Recover to properly catch these errors.
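The calling convention described above (an API function panics to raise an error, and code running outside the VM turns that panic back into an ordinary error) can be sketched without the library at all. This is only an illustration of the pattern: `scriptError`, `raise`, and `protect` are hypothetical names invented for this example, not the signatures of the library's Protect or Recover, so check the package documentation for the real ones.

```go
package main

import (
	"errors"
	"fmt"
)

// scriptError is a hypothetical stand-in for an error raised by a VM API call.
type scriptError struct{ msg string }

func (e *scriptError) Error() string { return e.msg }

// raise mimics an API function that "raises an error" by panicking.
func raise(msg string) {
	panic(&scriptError{msg: msg})
}

// protect runs f and converts a raised scriptError into a normal error value.
// Any other panic is treated as a genuine bug and is re-panicked unchanged.
func protect(f func()) (err error) {
	defer func() {
		r := recover()
		if r == nil {
			return
		}
		se, ok := r.(*scriptError)
		if !ok {
			panic(r)
		}
		err = se
	}()
	f()
	return nil
}

func main() {
	err := protect(func() { raise("attempt to index a nil value") })
	fmt.Println("caught:", err)

	var se *scriptError
	fmt.Println("is script error:", errors.As(err, &se))
}
```

Re-panicking on unknown values keeps real bugs (including the internal VM errors mentioned earlier) from being silently swallowed.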

The VM itself does not provide any Lua functions; the standard library is provided entirely by other packages. This means that the standard library never does anything that your own code cannot do (there is no "private API" that is used by the standard library).

Anything to do with the OS or file IO is not provided. Such things do not belong in the core libraries of an embedded scripting language (do you really want scripts to be able to read and write random files without restriction?).

All functions (including most of the internal functions) are documented to one degree or another, most quite well. The API is designed to be easy to use, and everything was added because I needed it. There are no "bloat" functions added because I thought they could be useful.

Note that another version of this exists over at ofunc/lua. That version has some interesting changes/features; I suggest you give it a look to see if it suits your needs better.

Loading Code:

This VM fully supports binary chunks, so if you want to precompile your script it is possible. To precompile a script for use with this VM you can either build a copy of luac (the reference Lua compiler) or use any other third-party Lua compiler, provided that it generates code compatible with the reference compiler. There is no separate compiler binary that you can build, but it wouldn't be hard to write one. Note that the VM does not handle certain instructions in pairs like the reference Lua VM does, and I don't remember if I made the compiler take advantage of this or not. If I did, then binaries generated by my compiler may not work with the reference VM.

If you want to use a third-party compiler it will need to produce binaries with the following settings:

When building the reference compiler on most systems these settings should be the default.

The VM API has a function that wraps luac to load code, but the way it does this may or may not fit your needs. To use this wrapper you will need to have luac on your path or otherwise placed so the VM can find it. See the documentation for State.LoadTextExternal for more information. Keep in mind that due to limitations in Go and luac, this function is not reentrant! If you need concurrency support it would be better to use State.LoadBinary and write your own wrapper.
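If you do write your own wrapper, one possible shape is sketched below. It is not how State.LoadTextExternal works internally; it is a stand-alone example that assumes a `luac` binary is installed, and `compileWithLuac` is a hypothetical helper name. Each call gets its own temporary directory, so concurrent calls do not share files. The returned bytes are what you would then hand to State.LoadBinary (that call is left out here; see its documentation for the exact signature).

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// compileWithLuac runs the luac binary at luacPath on src and returns the
// resulting binary chunk. Every call uses a private temporary directory.
func compileWithLuac(luacPath string, src []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "luac-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	out := filepath.Join(dir, "chunk.luac")

	// "luac -o <file> -" compiles the source read from stdin into <file>.
	cmd := exec.Command(luacPath, "-o", out, "-")
	cmd.Stdin = bytes.NewReader(src)
	var stderr bytes.Buffer
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("luac failed: %w: %s", err, stderr.String())
	}
	return os.ReadFile(out)
}

func main() {
	chunk, err := compileWithLuac("luac", []byte(`print("hello")`))
	if err != nil {
		fmt.Println("compile error:", err)
		return
	}
	fmt.Println("compiled", len(chunk), "bytes")
}
```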

The default compiler provided by this library does not support constant folding, and some special instructions are not used at all (instead preferring simpler sequences of other instructions). Expressions use a simple "recursive" code generation style, meaning that it wastes registers like crazy in some (rare) cases.

One of the biggest code quality offenders is or and and, as they can result in sequences like this one:

[4]   LT        A:1  B:r(0)   C:k(2)  ; CK:5
[5]   JMP       A:0  SBX:1            ; to:7
[6]   LOADBOOL  A:2  B:1      C:1
[7]   LOADBOOL  A:2  B:0      C:0
[8]   TEST      A:2           C:1
[9]   JMP       A:0  SBX:7            ; to:17
[10]  EQ        A:1  B:r(1)   C:k(3)  ; CK:<nil>
... (7 more instructions to implement next part of condition)

As you can see this is terrible. That sequence would be better written as:

[4]   LT        A:1  B:r(0)   C:k(2)  ; CK:5
[5]   JMP       A:0  SBX:2            ; to:8
[6]   EQ        A:1  B:r(1)   C:k(3)  ; CK:<nil>
... (1 more instruction to implement next part of condition)

But the current expression compiler is not smart enough to do it that way. Luckily this is the worst offender; most things produce code that is very close or identical to what luac produces. Note that this code is so bad entirely because the expression used or (the implementation of and and or is very bad).

To my knowledge there is only one case where my compiler does a better job than luac: when compiling loops or conditionals with constant conditions, impossible conditions are elided (so if you say while false do x(y z) end the compiler will do nothing). AFAIK there is no way to jump into such blocks anyway, so eliding them should have no effect on the correctness of the program.
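The idea of eliding impossible conditions can be illustrated with a toy pass over a made-up statement tree. None of the types below come from this library's ast package; `Stmt`, `ConstBool`, `Call`, `While`, `If`, and `elide` are all hypothetical and only model the two simplest cases (a loop or an else-less conditional whose condition is the constant false).

```go
package main

import "fmt"

// Stmt is a hypothetical, heavily simplified statement node.
type Stmt interface{}

// ConstBool is a condition whose value is known at compile time.
type ConstBool struct{ Value bool }

// Call stands in for any ordinary statement.
type Call struct{ Name string }

// While is a loop; Cond is either a ConstBool or some runtime expression.
type While struct {
	Cond interface{}
	Body []Stmt
}

// If is a conditional without an else branch.
type If struct {
	Cond interface{}
	Then []Stmt
}

// isConstFalse reports whether cond is the compile-time constant false.
func isConstFalse(cond interface{}) bool {
	c, ok := cond.(ConstBool)
	return ok && !c.Value
}

// elide returns block with loops and conditionals that can never run removed.
func elide(block []Stmt) []Stmt {
	var out []Stmt
	for _, s := range block {
		switch n := s.(type) {
		case While:
			if isConstFalse(n.Cond) {
				continue // "while false do ... end" emits nothing
			}
			n.Body = elide(n.Body)
			out = append(out, n)
		case If:
			if isConstFalse(n.Cond) {
				continue // "if false then ... end" emits nothing
			}
			n.Then = elide(n.Then)
			out = append(out, n)
		default:
			out = append(out, s)
		}
	}
	return out
}

func main() {
	block := []Stmt{
		Call{"a"},
		While{Cond: ConstBool{false}, Body: []Stmt{Call{"x"}}},
		Call{"b"},
	}
	fmt.Println("statements kept:", len(elide(block)))
}
```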

The compiler provides an implementation of a continue keyword, but the keyword definition in the lexer is commented out. If you want continue, all you need to do is uncomment the indicated line (near the top of ast/lexer.go). There is also a flag in the VM that should make tables use 0-based indexing. This feature has received minimal testing, so it probably doesn't work properly. If you want to try 0-based indexing, just set the variable TableIndexOffset to 0. Note that TableIndexOffset is strictly a VM setting; the standard modules do not respect it (for example, the table module and ipairs will still insist on using 1 as the first index).

Missing Stuff:

The following standard functions/variables are not available:

The following standard modules are not available:

Coroutine support is not available. I can implement something based on goroutines fairly easily, but I will only do so if someone actually needs it and/or if I get really bored...


In addition to the stuff that is not available at all the following functions are not implemented exactly as the Lua 5.3 specification requires:

Finally there are a few things that are implemented exactly as the Lua 5.3 specification requires, where the reference Lua implementation does not follow the specification exactly:

The following core language features are not supported:

TODO:

Stuff that should be done sometime. Feel free to help out :)

The list is (roughly) in priority order.

Changes:

A note on versions:

For this project I more or less follow semantic versioning, so I try to maintain backwards compatibility across point releases. That said, I feel free to break minor things in the name of bugfixes. Read the changelog before upgrading!


1.1.8

1.1.7

1.1.6

Fun with tables! Ok, not so much fun.

1.1.5

And, another stupid little bug.

1.1.4

Not sure how I missed this one... Oh well, it should work now.

1.1.3

One of the tests was failing on 32 bit systems, now it isn't.

1.1.2

More script tests, but no real compiler bugs this time. Instead I found several minor issues with a few of the API functions and a few other miscellaneous VM issues (mostly related to metatables).

This version also adds a minor new feature, nothing to get excited about... Basically I made it so that JSON or XML encoding an AST produces slightly more readable results for operator expression nodes. Someone else suggested the idea (actually they submitted a patch, yay them!). I never would have thought to do this myself (never needed it), but now that I have it, it seems like it could be useful for debugging the compiler among other things.

Unfortunately due to the way the AST and most encodings work, it is impossible to unmarshal the AST. I am not 100% sure if it is possible with XML or not, but it certainly will not work with JSON. This could maybe be fixed, but would be way too much work.

Anyway, these improvements are still useful if you want to examine the AST for whatever reason...

1.1.1

More script tests, more compiler bugs fixed. Same song, different verse.

1.1.0

I was a little bored recently, so I threw together a generic metatable API. It was a quick little project, based on earlier work for one of my many toy languages. This new API is kinda cool, but it in no way replaces proper metatables! Basically it is intended for quick projects and temporarily exposing data to scripts. It was fun to write, and so even if no one uses it, it has served its purpose :P

I really should have been working on more script tests, but this was more fun... I have no doubt responsibility will reassert itself soon.

Anyway, I also added two new convenience methods for table iteration, as well as some minor changes to the old one (you can still use it, but it is now a thin wrapper over one of the new functions, so you shouldn't).

1.0.2

More tests, more (compiler) bugs fixed. Damn compiler will be the death of me yet...

In addition to the inevitable compiler bugs I also fixed the way the VM handles upvalues. Before I was giving each closure its own copy of each upvalue, so multiple closures never properly shared values. This change fixes several subtle (and several not so subtle) bugs.
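The difference between the old and new upvalue behaviour can be shown with a small model. This is not the VM's actual data structure; `upvalue`, `closure`, `newPairCopied`, and `newPairShared` are hypothetical names used only to contrast "one copy per closure" with "one shared cell".

```go
package main

import "fmt"

// upvalue is a hypothetical cell holding one captured variable.
type upvalue struct{ value int }

// closure models a Lua closure that captured a single variable.
type closure struct{ up *upvalue }

func (c closure) incr()    { c.up.value++ }
func (c closure) get() int { return c.up.value }

// newPairCopied gives each closure a private copy of the variable,
// which models the old behaviour described above.
func newPairCopied(initial int) (closure, closure) {
	return closure{&upvalue{initial}}, closure{&upvalue{initial}}
}

// newPairShared gives both closures the same cell, so a write made
// through one closure is visible through the other.
func newPairShared(initial int) (closure, closure) {
	cell := &upvalue{initial}
	return closure{cell}, closure{cell}
}

func main() {
	a, b := newPairCopied(0)
	a.incr()
	fmt.Println("copied:", b.get()) // b does not see the increment

	a, b = newPairShared(0)
	a.incr()
	fmt.Println("shared:", b.get()) // b sees the increment
}
```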

Oh, and pcall works now (it didn't work at all before. Sorry, I never used it).

1.0.1

This version adds a bunch of tests (still not nearly as many as I would like), and fixes a ton of minor compiler errors. Most of the compiler errors were simple oversights, usually syntax constructs that I never used in my own code (and hence never tested).

The VM itself seems to be mostly bug free, but the compiler is a different story. I'm fixing bugs as fast as I discover them, but sometimes it's really tempting to just use luac and call it a day :P