google / schism

A self-hosting Scheme to WebAssembly compiler
Apache License 2.0

Functions sometimes return pointers, sometimes values #33

Open matthewp opened 6 years ago

matthewp commented 6 years ago

I might just be misunderstanding how this is supposed to work. If I have a library like:

(library
    (trivial)
  (export add addi)
  (import (rnrs))

  (define (addi)
    (+ 10 20))

  (define (add a b)
    (+ a b)))

addi will return a pointer that I can get the value from with engine.jsFromScheme(ptr). However, add returns a plain JavaScript value when called like instance.exports.add(1,2).

I admit to not understanding how arguments work yet but this seems like a bug. I would expect the API to always return pointers.

eholk commented 6 years ago

Everything in Schism should be a ptr. These are basically tagged values. Numbers carry the tag 0 in their low three bits, and the higher bits are the actual numbers. For pairs, they have a tag and then the higher bits are a pointer into linear memory where you can find two ptrs representing the car and cdr of the pair.

So what should happen is that instead of calling instance.exports.add(1,2), you should call instance.exports.add(schemeFromJs(1), schemeFromJs(2)), except that schemeFromJs doesn't exist yet.

The reason your example is working is basically that the + operation doesn't do any type checking. Numbers have the tag 0 so they can be added without any wrapping and unwrapping. So, your add function just blindly adds its two values together. The raw value 1 is the constant #f (it has the tag 1 for "constant" and the value 0 for "#f", see the tag definitions), and the raw value 2 is a pair at location 0 in linear memory (this is actually where the allocation pointer and symbol table live). Add should complain that you are trying to add #f to a pair, but instead it just does the addition, creating the null character (tag 3 and value 0).
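The arithmetic in the comment above can be checked by hand. The following is a minimal JavaScript sketch that assumes only the tag layout described in this thread (tag in the low three bits; 0 = number, 1 = constant, 2 = pair, 3 = character). `describe` is a hypothetical helper for illustration, not part of Schism's runtime.

```javascript
// Tag layout as described in this thread: the low three bits are the tag.
const TAG_BITS = 3;
const TAG_MASK = (1 << TAG_BITS) - 1; // 0b111

// Only the tags whose numeric values are stated in the thread.
const TAG_NAMES = { 0: "number", 1: "constant", 2: "pair", 3: "character" };

// Hypothetical helper: split a raw i32 into its tag name and value part.
function describe(raw) {
  const tag = raw & TAG_MASK;
  return { tag: TAG_NAMES[tag] ?? `tag ${tag}`, value: raw >> TAG_BITS };
}

const rawOne = describe(1);     // constant with value 0, i.e. #f
const rawTwo = describe(2);     // pair pointing at location 0
const rawSum = describe(1 + 2); // character with value 0, i.e. the null character
```

Because the untyped + simply adds the raw words, add(1, 2) produces the raw value 3, which decodes as a character rather than as the number 3.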

So I think there are two bugs here. One is that we need a way to convert integers from JavaScript into tagged integers that Schism needs. The second is that + should verify that it's actually adding two numbers together.
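Both fixes can be sketched in JavaScript. This is an illustration under stated assumptions, not Schism's implementation: `schemeFromJs` did not exist when this thread was written, `numberFromScheme` and `checkedAdd` are hypothetical names, and the 29-bit value part is assumed to be signed.

```javascript
const TAG_BITS = 3;
const TAG_MASK = 0b111;
const NUMBER_TAG = 0; // numbers carry tag 0, per the thread

// Hypothetical: convert a JS integer into a tagged Schism number.
function schemeFromJs(n) {
  if (!Number.isInteger(n)) {
    throw new TypeError("only integers are handled in this sketch");
  }
  // Assumption: the 29-bit value part is signed, so -2^28 <= n < 2^28.
  if (n < -(2 ** 28) || n >= 2 ** 28) {
    throw new RangeError("integer does not fit in 29 bits");
  }
  return (n << TAG_BITS) | NUMBER_TAG;
}

// Hypothetical inverse, for numbers only.
function numberFromScheme(ptr) {
  if ((ptr & TAG_MASK) !== NUMBER_TAG) {
    throw new TypeError("not a tagged number");
  }
  return ptr >> TAG_BITS;
}

// Illustrates the check that + could perform before adding.
function checkedAdd(a, b) {
  if ((a & TAG_MASK) !== NUMBER_TAG || (b & TAG_MASK) !== NUMBER_TAG) {
    throw new TypeError("+: expected two numbers");
  }
  return (a + b) | 0; // wrap like an i32 addition
}

const three = numberFromScheme(checkedAdd(schemeFromJs(1), schemeFromJs(2)));
```

With conversions in place, the tagged sum decodes to 3, while passing the raw values 1 and 2 to `checkedAdd` is rejected instead of silently producing a character.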

matthewp commented 6 years ago

Perfect 👍 Thanks for the explanation. So in addition to those two bugs there's also the problem of user error on my part; I didn't know how to call the functions properly.

I think it would be good to have a way in the runtime to create a wrapper for an instance (maybe in Module?) so you can call into schism with JS values and get JS values back out. Maybe I'll create a new issue to discuss what that should look like.

eholk commented 6 years ago

Yes! I would love to have this. It'd be great to automatically be able to wrap all the schism functions with conversions so they could be used just like JS functions.
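One possible shape for such a wrapper, sketched in JavaScript. Everything here is an assumption for illustration: `wrapExports` is a hypothetical name, the conversions cover integers only, and `fakeExports` stands in for a real `instance.exports` object.

```javascript
// Stand-in conversions, integers only (numbers carry tag 0 in the low
// three bits, as described earlier in this thread).
const toScheme = (n) => n << 3;
const toJs = (ptr) => ptr >> 3;

// Hypothetical helper: wrap every exported function so that arguments are
// converted on the way in and the result is converted on the way out.
function wrapExports(exports, convertIn, convertOut) {
  const wrapped = {};
  for (const [name, fn] of Object.entries(exports)) {
    if (typeof fn !== "function") continue; // skip memories, globals, tables
    wrapped[name] = (...args) => convertOut(fn(...args.map((a) => convertIn(a))));
  }
  return wrapped;
}

// Stand-in for instance.exports: mimics the untyped i32 addition that the
// compiled add and addi perform on tagged values.
const fakeExports = {
  add: (a, b) => (a + b) | 0,
  addi: () => ((10 << 3) + (20 << 3)) | 0,
};

const schism = wrapExports(fakeExports, toScheme, toJs);
```

Callers can then write `schism.add(1, 2)` and get a plain JavaScript number back without handling tagged values themselves.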

matthewp commented 6 years ago

First I might try to implement schemeFromJs to get a better understanding of how parameters are handled and how the compiler works in general. Do you have any pointers (pun intended) for where to get started?

eholk commented 6 years ago

I think schemeFromJs is a good starting point. You'll basically want to write the inverse of jsFromScheme.

You might also look at apply-representation, which is the part where Schism actually converts all the data into tagged pointers.

As far as data goes, everything is a tagged pointer packed into an i32. The high 29 bits are the actual value, and the low three bits are the tag. There are tags for numbers, constants, pairs, characters, strings, symbols, and closures so far. For numbers, constants and characters, there is no in-memory part; the data is completely stored in the pointer. The value part of a pair points to two consecutive words in memory, each of which is itself a tagged pointer. Strings are just lists of characters, but instead of having a pair tag, they have a string tag. If you find the implementation of string->list and list->string, you'll notice all those do is change the tag.
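The pair layout and the retagging trick can be modelled in JavaScript. This is a sketch under assumptions the thread does not confirm: the value part of a pair is treated as a byte address, words are 4 bytes and little-endian (as in WebAssembly linear memory), and the tag number passed to `retag` in the usage note is made up, since the thread gives no numeric value for the string tag. `makePair`, `car`, `cdr` and `retag` are hypothetical helpers.

```javascript
const TAG_BITS = 3;
const TAG_MASK = 0b111;
const PAIR_TAG = 2; // pairs carry tag 2, per the thread

// Stand-in for WebAssembly linear memory.
const memory = new DataView(new ArrayBuffer(256));

// Store two tagged words at `address` and return a tagged pair pointer.
// Assumption: the value part of a pair is a byte address.
function makePair(address, carValue, cdrValue) {
  memory.setInt32(address, carValue, true);     // first word: car
  memory.setInt32(address + 4, cdrValue, true); // second word: cdr
  return (address << TAG_BITS) | PAIR_TAG;
}

function car(pair) {
  return memory.getInt32(pair >>> TAG_BITS, true);
}

function cdr(pair) {
  return memory.getInt32((pair >>> TAG_BITS) + 4, true);
}

// Keep the value part and swap the tag, which is all that
// string->list and list->string are described as doing.
function retag(ptr, newTag) {
  return (ptr & ~TAG_MASK) | newTag;
}

// A pair holding the tagged numbers 1 and 2, placed at an arbitrary address.
const pair = makePair(64, 1 << TAG_BITS, 2 << TAG_BITS);
```

For example, `retag(pair, 4)` keeps the address 64 and only changes the low three bits, so the same two words in memory would be reinterpreted under whatever type tag 4 denotes.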

Symbols are a little trickier. There's a linked list that keeps track of all the symbols. Things tagged with symbol point into this list. Each entry either has a string, if it's a normal symbol, or #f if it was a gensym. This means we can compare two symbols for equality just by comparing their pointers, and we can convert between strings and symbols, while every gensym stays distinct from all other symbols.
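The symbol scheme described above can be modelled with ordinary JavaScript objects. This is only a model of the idea, not Schism's in-memory layout: object identity stands in for pointer equality, and `intern` and `gensym` are hypothetical names.

```javascript
// Head of a linked list of symbol entries. Each entry has a `name` that is
// a string for a normal symbol, or false (standing in for #f) for a gensym.
let symbolList = null;

// Return the existing entry for `name`, or add a new one to the list.
function intern(name) {
  for (let entry = symbolList; entry !== null; entry = entry.next) {
    if (entry.name === name) return entry;
  }
  symbolList = { name, next: symbolList };
  return symbolList;
}

// Always add a fresh, nameless entry.
function gensym() {
  symbolList = { name: false, next: symbolList };
  return symbolList;
}

const foo1 = intern("foo");
const foo2 = intern("foo");
const g1 = gensym();
const g2 = gensym();
```

Here `foo1 === foo2` holds because both refer to the same list entry, while `g1` and `g2` are different entries and so never compare equal to each other or to any named symbol.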

For how the compiler works, it is a set of passes that start with Scheme source code and then transform it into lower level representations until finally we can generate WebAssembly from it. A lot of Scheme's higher level features get eliminated pretty early in the compiler. This means the backend can be simpler because the language is much smaller at that point. You can see the passes and what order they are called in by looking at the compile-library function.
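The pass structure can be illustrated with a toy pipeline in JavaScript. The two passes below are invented for this sketch and are far simpler than anything in Schism. They only show the shape: each pass takes one representation and returns a lower-level one, and the compiler applies them in a fixed order.

```javascript
// Toy pass 1: replace number literals with their tagged form (tag 0 in the
// low three bits), leaving operators and nesting untouched.
function tagLiterals(expr) {
  if (typeof expr === "number") return expr << 3;
  if (Array.isArray(expr)) return expr.map((e) => tagLiterals(e));
  return expr;
}

// Toy pass 2: flatten the expression into stack-machine instructions.
// Only a two-argument + is supported in this sketch.
function emitInstructions(expr) {
  if (typeof expr === "number") return [`i32.const ${expr}`];
  const [op, ...args] = expr;
  if (op !== "+" || args.length !== 2) {
    throw new Error(`unsupported expression: ${JSON.stringify(expr)}`);
  }
  return [...args.flatMap((a) => emitInstructions(a)), "i32.add"];
}

// The pipeline: run every pass in order, feeding each result to the next.
const passes = [tagLiterals, emitInstructions];
const compile = (expr) => passes.reduce((ir, pass) => pass(ir), expr);

// (+ 1 2), written as a nested array.
const instructions = compile(["+", 1, 2]);
```

By the time the last pass runs, the input contains nothing but tagged integers and a single primitive, which is the sense in which a small late-stage language keeps the backend simple.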

I realize the documentation could be a lot better, but hopefully this is enough to point you in the right direction. Feel free to keep asking questions!

matthewp commented 6 years ago

I can't thank you enough for taking your time to teach me about all of this, it helps tremendously.

As far as testing the interop is concerned, it looks like all of the current tests just test for truthy return values. What I'm thinking of doing is following your naming convention and adding in a JavaScript module that can do the testing. Maybe something like:

add-nums.ss

(library
    (trivial)
  (export do-test)
  (import (rnrs))

  (define (do-test)
    (+ 1 2)))

add-nums.mjs

export function test(wasm, engine, assert) {
  // ... do testing here
}

Not sure exactly what it will look like, but something like that.
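A sketch of how the body of that test function might look, under several assumptions: `wasm` is the instantiated module, the Scheme name `do-test` is exported under that exact string, `engine.jsFromScheme` behaves as described earlier in the thread, and `assert` offers a Node-style `strictEqual`. The `fake*` objects are stand-ins so the sketch runs on its own. In add-nums.mjs the function would carry the `export` keyword.

```javascript
// Verifies that the Scheme do-test, which computes (+ 1 2), returns 3.
function test(wasm, engine, assert) {
  const ptr = wasm.exports["do-test"](); // tagged Schism value
  assert.strictEqual(engine.jsFromScheme(ptr), 3);
}

// Stand-ins for the real harness objects (assumptions, integers only).
const fakeWasm = { exports: { "do-test": () => 3 << 3 } };
const fakeEngine = { jsFromScheme: (ptr) => ptr >> 3 };
const fakeAssert = {
  strictEqual(actual, expected) {
    if (actual !== expected) {
      throw new Error(`expected ${expected}, got ${actual}`);
    }
  },
};

test(fakeWasm, fakeEngine, fakeAssert); // passes silently
```

Checking the converted value rather than mere truthiness would catch a do-test that returns some other non-false value.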

eholk commented 6 years ago

So the idea would be that the JS test function would verify that the Scheme do-test returns 3?

That seems like a good idea. Most of the tests so far are self-testing, which generally gives good compiler coverage but can miss cases where the compiler generates incorrect code that just happens to work.

I think more extensive testing of JS/Scheme integration is a great idea!