google / schism

A self-hosting Scheme to WebAssembly compiler
Apache License 2.0

Factor primitives into external libraries #90

eholk closed this 5 years ago

eholk commented 5 years ago

This splits the big list of primitive functions into their proper libraries. The majority of them go into (rnrs). The pair mutator functions go into (rnrs mutable-pairs). We had a handful of non-standard functions that go into (schism), which is somewhat analogous to the (chezscheme) library that Schism used to pull a few functions from.

Issue #19

eholk commented 5 years ago

Unfortunately, this breaks bootstrap-from-guile.sh in its current form. I'd like to fix that before landing this, but I'm not sure how. The first problem was that Guile was trying to read schism.ss from the root directory, rather than from lib/. That's easy enough to fix; we can just remove the file, since we don't use it anymore.

After that, I tried adding (add-to-load-path "./lib"). This gets a little further, but then it gets confused by rnrs.ss. Is there a way to say "while compiling (schism compiler), use our rnrs instead of Guile's"?

wingo commented 5 years ago

Hi!

Question: what's the ultimate vision around modules? Do they get compiled to separate .wasm files and get delivered separately? Or is the vision to generally do whole-program compilation? Do we have any possibility for cross-module inlining? If we do the whole-program thing it should come out naturally with a cp0 pass.

Why does the stage0 become so much smaller in this commit? This is a good thing but I would like to know why :)

Regarding the Guile bootstrap, I think what we actually want is for Guile to use its own rnrs.ss, right? In that case add-to-load-path doesn't do the right thing, as it adds the path to the front, whereas in that case we should add it to the back. I'll have a look.
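To make the ordering issue concrete, here is a toy model of a load path in TypeScript. It is not Guile code: the directory names and their contents are assumptions chosen for illustration, and the only behaviour it models is "search the directories in order, first match wins".

```typescript
// Toy model of a module load path: directories are searched in order and
// the first one that contains the requested file wins.
type Dir = { name: string; files: string[] };

function resolve(loadPath: Dir[], file: string): string | undefined {
  for (const dir of loadPath) {
    if (dir.files.indexOf(file) !== -1) {
      return `${dir.name}/${file}`;
    }
  }
  return undefined; // not found anywhere on the path
}

// Hypothetical directory contents, for illustration only.
const guileLibs: Dir = { name: "<guile>", files: ["rnrs.ss"] };
const schismLibs: Dir = { name: "./lib", files: ["rnrs.ss", "schism.ss"] };

const prepended: Dir[] = [schismLibs, guileLibs]; // ./lib added to the front
const appended: Dir[] = [guileLibs, schismLibs];  // ./lib added to the back

console.log(resolve(prepended, "rnrs.ss"));  // ./lib/rnrs.ss
console.log(resolve(appended, "rnrs.ss"));   // <guile>/rnrs.ss
console.log(resolve(appended, "schism.ss")); // ./lib/schism.ss
```

With ./lib at the back, the host's own rnrs still shadows Schism's copy, while files that only exist in ./lib remain reachable.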

eholk commented 5 years ago

> Question: what's the ultimate vision around modules? Do they get compiled to separate .wasm files and get delivered separately? Or is the vision to generally do whole-program compilation? Do we have any possibility for cross-module inlining? If we do the whole-program thing it should come out naturally with a cp0 pass.

Ideally, I'd like to see some combination of all of the above. It'd be nice if we could build a libschism.wasm file that could go up on a CDN and any Schism users could just pull from there. Then, for their own local code, they wouldn't have to ship the whole library, just their own code. It'd be nice to be able to bundle multiple modules together though, so we don't have to have one each for librnrs.wasm, librnrs-mutable-pairs.wasm, etc.

I really don't want to give up the ability to do cross-module inlining. For separate .wasm files, as far as I can tell, this means packing a serialized AST into the module as well. Maybe we'd have a size threshold and only include bodies for functions where inlining is likely to make sense.
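A minimal sketch of the size-threshold idea, in TypeScript. The AST shape, the function names, and the node-count metric are all assumptions made for illustration; nothing here reflects Schism's actual AST or serialization format.

```typescript
// Hypothetical AST shape: an operator with zero or more child nodes.
type Ast = { op: string; args: Ast[] };

// Size metric assumed here: total number of nodes in the tree.
function size(node: Ast): number {
  let total = 1;
  for (const arg of node.args) {
    total += size(arg);
  }
  return total;
}

type Def = { name: string; body: Ast };

// Keep bodies only for functions small enough to be plausible inlining
// candidates; the rest would ship as compiled code only.
function bodiesToSerialize(defs: Def[], maxNodes: number): Def[] {
  return defs.filter((def) => size(def.body) <= maxNodes);
}

const leaf = (op: string): Ast => ({ op, args: [] });
const defs: Def[] = [
  { name: "small", body: { op: "call", args: [leaf("x")] } }, // 2 nodes
  { name: "large", body: { op: "if", args: [leaf("a"), leaf("b"), { op: "call", args: [leaf("c"), leaf("d")] }] } }, // 6 nodes
];

console.log(bodiesToSerialize(defs, 3).map((def) => def.name)); // [ 'small' ]
```

The threshold trades module size against how much cross-module inlining remains possible for importers.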

So, long term what I imagine is we can load already-compiled modules from .wasm files, but these include serialized ASTs, so code generated by Schism will have a lot of aspects of whole-program compilation.

One reason I'd like to support loading from separate .wasm files is that in the meantime, I see it as a path towards eval. I'd like to reuse the compiler to implement eval, rather than adding an interpreter. At the moment, making this work means generating a new .wasm buffer, calling out to the runtime to have it compile and link the module, and then stick an entry in a function table that it returns back to Schism. Things might change if Wasm gets JIT instructions though, or if Wasm gets anyfunc.
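The compile-and-link step described above could look roughly like the following on the runtime side. This is a sketch using the standard WebAssembly JavaScript API, not Schism's runtime: loadCompiled is a hypothetical hook, and the hand-assembled bytes stand in for a buffer the compiler would generate.

```typescript
// Accessed through globalThis so the sketch does not depend on DOM typings.
const WA = (globalThis as any).WebAssembly;

// Hand-assembled module exporting "main": () -> i32, which returns 42.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number, version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: func 0 uses type 0
  0x07, 0x08, 0x01, 0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x00, // export "main" = func 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // code section: i32.const 42; end
]);

// Function table that already-compiled code could call into indirectly.
const table = new WA.Table({ element: "anyfunc", initial: 0 });

// Hypothetical runtime hook: compile and instantiate the buffer, store its
// entry point in the table, and hand the new table index back to the caller.
function loadCompiled(buffer: Uint8Array): number {
  const instance = new WA.Instance(new WA.Module(buffer), {});
  const index: number = table.grow(1); // grow returns the previous length
  table.set(index, instance.exports.main);
  return index;
}

const index = loadCompiled(bytes);
const result: number = table.get(index)();
console.log(index, result); // 0 42
```

Synchronous compilation is used here only to keep the sketch short; browsers restrict it to small modules on the main thread, so a real runtime would likely use the asynchronous WebAssembly.instantiate instead.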

> Why does the stage0 become so much smaller in this commit? This is a good thing but I would like to know why :)

My guess is that it's because we no longer have a big quoted list of all the primitives, since these are loaded from a file instead.

> Regarding the Guile bootstrap, I think what we actually want is for Guile to use its own rnrs.ss, right? In that case add-to-load-path doesn't do the right thing, as it adds the path to the front, whereas in that case we should add it to the back. I'll have a look.

Good point. I realized this morning that a real rnrs includes definitions for things like define, which are still magical in Schism. Maybe the right thing to do is split up the libraries into Scheme libraries and Schism-specific libraries? Schism would know to load from both, but when bootstrapping from Guile, we'd only add the Schism-specific libraries to the load path.

eholk commented 5 years ago

Unfortunately, this probably breaks the playground too, since Schism no longer contains the libraries programs need, but has to load these from a file. We can solve this with another file system interface though.

wingo commented 5 years ago

I guess I'm a little hesitant about moving away from the nice property Schism has now of this two-file deliverable. I want to go a little deeper on the multi-module strategy -- I think if you conceive of Schism as being this interactive development environment where users add code at run-time, then I can see how this makes multiple modules linked with imports attractive. But if you conceive of Schism as being a project that you essentially write and compile ahead-of-time then ship the user a single file, then a much easier route to multi-module compilation is whole-program compilation. It would also likely produce smaller, faster deliverables, and you could inline and simplify much more aggressively.

I dunno, I think we are climbing a local maximum here on the single-binary self-hosted side. Multiple modules are part of that hill, but probably not multiple .wasm files. Then there's another local maximum, being a full-blown Scheme development environment with dynamic linking. Does that match how you perceive it too? Are you looking to switch more towards the latter?

eholk commented 5 years ago

Thanks for asking these questions to help clarify the vision for Schism. It'd probably be worth collecting these into a document somewhere, rather than having the discussions spread across issues and PRs.

Just to make sure, the two files you talked about are the JS runtime and the .wasm file generated by Schism, right? In other words, the compiled output for a program someone wrote, and not necessarily the distribution of the Schism compiler itself, right?

Maybe one way to look at this is to ask what we see as the use case for Schism. I can think of two obvious ways people might use it. The first is for people who want to create web apps in Scheme and distribute them. The second is to have some kind of web-based IDE that's maybe more educationally focused (sort of like a DrRacket on the web). I think these are roughly the two cases you suggested too.

I think I'm much more likely to use Schism as a way to create compiled-ahead web apps than as a general IDE. We're also much closer to that use case than to a full IDE, and I don't see any reason one couldn't create a Schism IDE as a compiled-ahead app using Schism.

One point in favor of dynamic loading is that I don't know that Wasm has a great story here yet. It seems like there are all the pieces you'd need to make this work, but I'm not sure to what extent anyone has done this. Schism has an opportunity to be a trailblazer here.

Anyway, I think it makes sense to keep climbing the whole program compilation hill. At some point, I'd like to support eval, preferably through compilation rather than by building an interpreter. I think this is going to necessitate some form of dynamic code loading, but hopefully we can do it in a way that's complementary to whole program compilation.

eholk commented 5 years ago

I finally found some time to work on this again. I split the libraries into lib/ and scheme-lib/, where scheme-lib/ is only used by Schism. This makes bootstrap-from-guile.sh work again, and it produces the same output as bootstrapping from Schism.

I'd still like to fix the playground before merging this, but hopefully I can get that working again this evening or tomorrow. I think it just needs a web implementation of Schism's filesystem API.

eholk commented 5 years ago

The playground works in Chrome now if you give Chrome the argument --js-flags="--experimental-wasm-anyref --experimental-wasm-return_call".

I think this PR is at a point where I'd feel comfortable merging it. @wingo - what do you think?