Rembulan is currently only able to load text chunks. It might be a good idea to consider adding support for binary chunks.
Java bytecode (as generated by Rembulan), even packed in a JAR file, is probably too low-level for this purpose. Classes in Java bytecode are named, whereas function prototypes in the binary chunk should be (mostly) anonymous. Additionally, there is no way of controlling the origin of the JAR file (and hence its contents): most users would not want to load arbitrary bytecode if they expect a pre-compiled Lua function.
Therefore, these binary chunks should probably be based on the compiler's already-existing IR (intermediate representation).
Advantages
Loading binary chunks would eliminate the time spent in parsing, performing static analysis, and optimising the code.
string.dump could be implemented.
Function prototypes could be transmitted over the wire.
This feature would help in serialisation (#4). (Upvalues need to be serialised as well, and at least in PUC-Lua, these are not included in a binary chunk -- an expectation that should probably not be broken in Rembulan.)
Disadvantages
Strictly speaking, this is not necessary to implement the Lua programming language. (The Lua bytecode is an implementation detail of PUC-Lua.)
The implementation would introduce additional complexity and a new source of bugs into the compiler.
The IR should not be considered stable at this point: there are still outstanding optimisations that may require changing the IR. Freezing the IR in order to define a binary format would slow down progress on that front.
Rembulan's IR is very different from the PUC-Lua bytecode, and its binary format would be too. (In other words, it would still not be possible to load a chunk compiled by PUC-Lua in Rembulan.)
Security and safety implications: a malformed binary chunk may crash the compiler. To do this right, this feature would probably require a verifier. (On the other hand, the IR does have some of the necessary checks in place already, and is immutable. The compiler could be made resilient to malformed IR input.)
Binary chunk loading is typically not permitted in sandboxed environments, so the usefulness in the niche Rembulan is explicitly targeting is limited.
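To make the verifier concern more concrete, here is a minimal sketch of a defensive reader. The chunk layout (magic number, version, a table of string constants), the class name ChunkReaderSketch and all constants are hypothetical and are not Rembulan's actual format; a real IR format would carry far more structure. The point is only that every field is checked before it is used, so malformed input is reported as an IOException instead of reaching the compiler.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical chunk layout (an assumption, NOT Rembulan's actual format):
//   int magic, int version, int constantCount, then constantCount UTF strings.
final class ChunkReaderSketch {
    static final int MAGIC = 0x52424C31;      // arbitrary marker ("RBL1" in ASCII)
    static final int VERSION = 1;             // assumed format version
    static final int MAX_CONSTANTS = 1 << 16; // arbitrary upper bound

    // Writes a chunk containing only a constant table.
    static byte[] write(String... constants) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(MAGIC);
            out.writeInt(VERSION);
            out.writeInt(constants.length);
            for (String c : constants) {
                out.writeUTF(c);
            }
        }
        return bytes.toByteArray();
    }

    // Reads a chunk, rejecting anything that does not match the layout.
    // Truncated input surfaces as EOFException, which is an IOException.
    static String[] read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            if (in.readInt() != MAGIC) {
                throw new IOException("not a binary chunk");
            }
            int version = in.readInt();
            if (version != VERSION) {
                throw new IOException("unsupported version: " + version);
            }
            int count = in.readInt();
            if (count < 0 || count > MAX_CONSTANTS) {
                throw new IOException("bad constant count: " + count);
            }
            String[] constants = new String[count];
            for (int i = 0; i < count; i++) {
                constants[i] = in.readUTF();
            }
            if (in.read() != -1) {
                throw new IOException("trailing bytes after chunk");
            }
            return constants;
        }
    }
}
```

Checks of this kind cover only the encoding. Semantic validation of the decoded IR (for example, that every jump target exists) would still be the job of the verifier mentioned above.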
How this could be done
The following steps are required in order to implement this feature such that it is usable/useful:
Define a binary format for the IR, and implement its writer and reader.
(Optionally, implement a verifier for security.)
Add an additional entry point into the compiler pipeline that accepts loaded IR.
Attach the binary representation to compiled functions in a way that is accessible at runtime. (The best way to go about this is probably using Java annotations.)
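The last step could look roughly like the sketch below. Everything in it is an assumption made for illustration: the annotation name BinaryChunk, the choice of Base64 for encoding the payload, and the placeholder class standing in for a compiler-generated function class. It only shows that a runtime-retained annotation is enough for something like string.dump to recover the bytes from a loaded class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Base64;

// Hypothetical annotation; the name and the Base64 encoding are assumptions.
// RUNTIME retention is what makes the value visible through reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface BinaryChunk {
    String value();
}

// Stand-in for a class the compiler would generate for a Lua function.
// The payload is a placeholder: Base64 of the four ASCII bytes "RBL1".
@BinaryChunk("UkJMMQ==")
final class CompiledFunctionSketch { }

final class ChunkLookup {
    // Returns the binary representation attached to a compiled function class.
    static byte[] dump(Class<?> functionClass) {
        BinaryChunk chunk = functionClass.getAnnotation(BinaryChunk.class);
        if (chunk == null) {
            throw new IllegalArgumentException(
                    "no binary chunk attached to " + functionClass.getName());
        }
        return Base64.getDecoder().decode(chunk.value());
    }
}
```

One caveat with this approach: a string constant in a class file is limited to 65535 bytes in its encoded form, so a larger chunk would have to be split, for example across a String[] annotation member.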