An idea for building and delivering $lang programs

madmalik commented 6 years ago

I assume for this writeup that we have a viable strategy to compile $lang to Rust and that it yields a sizeable performance boost. Both assumptions have to be evaluated, but before that we should be sure we actually want that. ;)

I also assume the following compilation steps: $lang -> bytecode and bytecode -> Rust.

The idea is to piggyback completely on Rust infrastructure. In the simplest case, a $lang program is a Rust project with a build.rs that does one of two things:

a.) It creates a main.rs that, when run, opens the actual project (maybe from a $lang-src folder in the Rust project), compiles it into bytecode and then runs the bytecode. This would be the normal development workflow. The interpreter has to be compiled once, and all subsequent cargo run commands just execute our $lang sources.
b.) It opens and compiles the project into bytecode and then compiles the bytecode into a main.rs. That would be the release mode, and the deployable artifact is a pure Rust program. This binary may also need to contain the full interpreter if runtime code execution or other dynamic features are desired.
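To make the two modes concrete, here is a minimal sketch of the decision a build.rs could make. It only produces the text of the generated main.rs. The names `BuildMode`, `mode_from_profile`, `generate_main` and the `lang_runtime` crate with its `compile_dir` and `run` functions are all made up for illustration. The one real detail is that Cargo sets the `PROFILE` environment variable ("debug" or "release") for build scripts.

```rust
/// The two build modes from the description above (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BuildMode {
    /// (a) main.rs loads the $lang sources at run time and interprets them.
    Development,
    /// (b) main.rs is the Rust translation of the compiled bytecode.
    Release,
}

/// Map Cargo's PROFILE value ("debug" or "release") onto the two modes.
pub fn mode_from_profile(profile: &str) -> BuildMode {
    if profile == "release" {
        BuildMode::Release
    } else {
        BuildMode::Development
    }
}

/// Return the text of the main.rs that build.rs would write.
/// `lang_runtime` is a placeholder crate name, not a real library.
/// `transpiled_body` stands in for the output of a bytecode -> Rust step.
pub fn generate_main(mode: BuildMode, src_dir: &str, transpiled_body: &str) -> String {
    match mode {
        // Development: a thin launcher that compiles and runs the sources.
        BuildMode::Development => format!(
            "fn main() {{\n    let bytecode = lang_runtime::compile_dir({:?});\n    lang_runtime::run(&bytecode);\n}}\n",
            src_dir
        ),
        // Release: the program itself, already translated to Rust.
        BuildMode::Release => format!("fn main() {{\n{}\n}}\n", transpiled_body),
    }
}
```

In an actual build.rs, `main` would presumably read `std::env::var("PROFILE")`, call `generate_main`, and write the result into the directory named by `OUT_DIR`. Whether the generated file then replaces src/main.rs or gets pulled in some other way is left open here.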

Libraries are written in either $lang or Rust and are distributed by cargo. From the lib-user perspective both look and behave the same (since the lib code is compiled to Rust). That means that every time a library is added, the development interpreter has to be rebuilt.

In essence: A standalone $lang project is not just the $lang source files, but also contains the complete description to build the runtime plus all libraries, and it can create a single, self-contained artefact that can be easily deployed.

Why is that useful?

Just delivering a few source files and letting the end user sort out the runtime (including all necessary libs) is not really an option. So a bundle has to be created. If we use libs written in Rust, this bundle will contain platform-specific binary blobs anyway, so why not just throw the rest of the runtime into it? (That's what many projects written in a dynamic language end up doing anyway.)

It also reuses the excellent cargo infrastructure and makes it more natural to gradually migrate to Rust proper.

Drawbacks:

To make this strategy viable, the runtime has to be quite small so that the cost of delivering it with every app is manageable.
It potentially exposes a lot of implementation details to the $lang user.
It binds our language very closely to Rust, which may or may not be desirable.

There are two use-cases that I haven't written about:

Single-file scripts where we don't want to create a whole project. There we would cargo install a global interpreter with all necessary libraries.
$lang embedded into a Rust program. I don't see inherent problems using the interpreter directly, but the mechanics of when and how to compile to Rust would need some thought.

It'd be nice if you poked some holes into this idea.

pliniker commented 6 years ago

I see what you're saying and why it sounds attractive. Maybe carefully designed bytecode instructions coupled with something like https://github.com/Manishearth/rust-gc to manage object lifetimes might make it feasible to translate to a limited subset of Rust. The bytecode would essentially be another layer of intermediate representation.
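As a rough illustration of bytecode acting as an intermediate representation on the way to Rust, here is a sketch of a translator for a tiny stack-based instruction set. The `Op` enum and `emit_rust` are invented for this example and are not taken from any existing project. Object lifetimes (the part rust-gc would address) are ignored entirely: the only values are integers.

```rust
/// A tiny, hypothetical stack-based instruction set.
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Push(i64),
    Add,
    Mul,
    Print,
}

/// Translate bytecode into Rust statements by symbolically executing the
/// stack: each stack slot holds a Rust expression instead of a value.
pub fn emit_rust(ops: &[Op]) -> Result<String, String> {
    let mut stack: Vec<String> = Vec::new();
    let mut out = String::new();
    for op in ops {
        match op {
            // A literal becomes a typed Rust literal.
            Op::Push(n) => stack.push(format!("{}i64", n)),
            // Binary operators combine the two topmost expressions.
            Op::Add | Op::Mul => {
                let rhs = stack.pop().ok_or("stack underflow")?;
                let lhs = stack.pop().ok_or("stack underflow")?;
                let sym = if *op == Op::Add { "+" } else { "*" };
                stack.push(format!("({} {} {})", lhs, sym, rhs));
            }
            // Print consumes the top expression and emits a statement.
            Op::Print => {
                let top = stack.pop().ok_or("stack underflow")?;
                out.push_str(&format!("println!(\"{{}}\", {});\n", top));
            }
        }
    }
    Ok(out)
}
```

For the sequence `Push(2), Push(3), Add, Push(4), Mul, Print` this yields the single statement `println!("{}", ((2i64 + 3i64) * 4i64));`. A real instruction set with control flow and heap objects would need considerably more than this expression folding.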

What kind of language do you think would benefit from this approach?

madmalik commented 6 years ago

> What kind of language do you think would benefit from this approach?

From the language perspective: Languages that are on the static side of the dynamic language spectrum might be more conducive to this approach. For example, Python might be problematic because everything is monkey-patchable. Personally I'd like that, because I think excessive (and accidentally introduced) dynamism is not necessarily a good thing. On the other hand, it'd be a shame if we lose out on simplicity and flexibility (in the end, that is the whole point of introducing a dynamic language to work with Rust) just to favour an implementation strategy.

From the use-case perspective: My background is Python web development, and I totally understand people saying that switching to Go is worth it for the easy deployment alone (just push a self-contained binary to the server, done). I know application developers in several "hosted" languages who switched to delivering their own runtime with their apps because of the headache of requiring users to have the right Java, Python or whatever version installed. Also, I don't think we can expect any OS distribution to have a $lang interpreter written in Rust preinstalled in the future. ;)

For me it's a lot less clear whether the "embedded language" use case (scripting of plugins, game logic or whatever) is served by this approach. A small interpreter with optional JIT (when binary size allows it) could be a better solution for that, I don't know.

pliniker commented 6 years ago

Transpiling Python to Go is a thing, of course.

FWIW, at present my thoughts are that given the choices of interpreting, JIT compiling and AOT compiling, the lowest maintenance options are interpreting and AOT since there isn't a broadly maintained, compatibly-licensed JIT library around.

However, I'm not convinced that transpiling to Rust makes the most sense at the outset? I understand compiling to C or Go, but Rust is opinionated. If a subset of Rust, with library support perhaps, could be shown to map reasonably from another language, and compile times improve...

cipriancraciun commented 6 years ago

I have also thought about this topic of how one can transform a "script" (or "project" including dependencies) written in $lang into an actual OS executable, without any external dependencies.

However, for the initial versions, a third approach, very similar to (b), might be suitable: instead of embedding the bytecode version of the script in main.rs, one could use include_str! and include the actual source code of the "script" (and dependencies). This approach is extremely simple to implement: instead of loading the source code from the filesystem, one just takes it out of a HashMap or similar.

For example I employ a similar technique for testing my interpreter as seen in the following snippet: https://github.com/cipriancraciun/vonuvoli-scheme/blob/master/tests/scheme.rs

#[ cfg ( feature = "vonuvoli_builtins_regex" ) ]
def_scheme_tests_from_file! (   
    test__regex_strings => "scheme/regex-strings.sst",
    test__regex_bytes => "scheme/regex-bytes.sst",
);

Moreover it needs zero extensions for cargo (not even build.rs) and should work just with a plain rustc manual command.
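The include_str! approach described above could look roughly like the following sketch. The module names, the Scheme-like contents and the functions `embedded_sources` and `load_source` are all invented for illustration. String literals stand in for the `include_str!` calls so that the snippet compiles without any extra files on disk.

```rust
use std::collections::HashMap;

// In a real project each constant would be `include_str!("path/to/file")`,
// which embeds the file's contents at compile time. Literals are used here
// so the sketch is self-contained; names and contents are made up.
const MAIN_SRC: &str = "(display (greet \"world\"))";
const LIB_SRC: &str = "(define (greet name) (string-append \"hello \" name))";

/// Build the table of embedded sources, keyed by module name.
pub fn embedded_sources() -> HashMap<&'static str, &'static str> {
    let mut sources = HashMap::new();
    sources.insert("main", MAIN_SRC);
    sources.insert("lib/greet", LIB_SRC);
    sources
}

/// Stand-in for reading a module from the filesystem: look it up in the
/// embedded table and report an error if it was never bundled.
pub fn load_source(name: &str) -> Result<&'static str, String> {
    embedded_sources()
        .get(name)
        .copied()
        .ok_or_else(|| format!("no embedded module named {}", name))
}
```

The interpreter's module loader would then call something like `load_source("main")` where it previously opened a file. Rebuilding the table on every lookup keeps the sketch short. A real implementation would probably build it once.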

However, as @madmalik observed in his drawbacks section, delivering such "scripts" or "projects" written in $lang as actual Rust libraries, via cargo, might seem overwhelming for the casual user.

Think about the following use-case:

one discovers $lang and wants to write some simple scripts;
one downloads $lang-release-v99.zip, extracts the files, and places them somewhere;
one writes the script and tests it with $lang-interpreter ./script script-arguments;
one now wants to deploy the script and the required runtime as a single file to a remote server, thus one just runs $lang-compiler --in ./script --out ./executable;

Now compare the simple procedure described above to a "recipe" which says:

first install rustup, fetch nightly, and configure it to be used by default;
now cargo new and manually configure Cargo.toml and in the [dependencies] section add $lang-runtime = "v99";
now inside the lang-src folder create your script file;
now run cargo run -- script-arguments; (emphasis on -- or else cargo run would use those flags...)
want to write a separate script, unrelated to the previous one? repeat from step 1;

To conclude, a useful outcome from this WG would be a tutorial on how to write a $lang-compiler which, without requiring a Rust deployment (but perhaps somehow embedding the Rust compiler), might take a "script" and compile it into an executable. I.e. basically hiding all Rust-related details behind the curtains...

cipriancraciun commented 6 years ago

@pliniker Regarding transpiling $lang to Rust, I think there could be a potential use-case even for dynamic languages, as the Rust compiler (and the LLVM infrastructure) might at least optimize loops and function calls; moreover if the $lang code is "optimized" a little (via transformations) and function calls are clearly identified, Rust's inlining might at least eliminate some jumps.
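To illustrate the kind of code such a transpiler might hand to the Rust compiler, here is a sketch with a dynamically typed `Value` and a small runtime helper for `+`. Everything in it (the `Value` enum, `add`, `sum_below`, and the loop syntax in the comment) is hypothetical. It makes no claim about how much the optimizer actually gains. It only shows that the loop and the helper call are ordinary Rust that rustc and LLVM are free to inline and optimize.

```rust
/// A hypothetical dynamically typed value, as generated code might use it.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
}

/// Runtime helper for `+`. It is small and marked #[inline] as a hint that
/// the compiler may fold it into the caller instead of emitting a call.
#[inline]
pub fn add(a: &Value, b: &Value) -> Result<Value, String> {
    match (a, b) {
        (Value::Int(x), Value::Int(y)) => Ok(Value::Int(x.wrapping_add(*y))),
        (Value::Str(x), Value::Str(y)) => Ok(Value::Str(format!("{}{}", x, y))),
        // Mixed types are a runtime error, as in most dynamic languages.
        _ => Err(format!("cannot add {:?} and {:?}", a, b)),
    }
}

/// What a transpiler might emit for a made-up $lang loop such as
/// `total = 0; for i in 0..n: total = total + i`.
pub fn sum_below(n: i64) -> Result<Value, String> {
    let mut total = Value::Int(0);
    for i in 0..n {
        // Each `+` in the source becomes a call to the runtime helper.
        total = add(&total, &Value::Int(i))?;
    }
    Ok(total)
}
```

Calling `sum_below(5)` returns `Ok(Value::Int(10))`. If the transformations mentioned above could prove that `total` is always an integer, the generated code could drop the `Value` wrapper altogether, which is where larger gains would presumably come from.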

madmalik commented 6 years ago

I realise that I conflated two issues needlessly. One is the question of optional AOT compilation (in whatever form) and the other is the reuse of rust dependency management and build infrastructure.

Maybe the question of dependency management is too early and should be done when the language itself is done, idk really...

rust-hosted-langs / runtimes-WG

An idea for building and delivering $lang programs #4