WebAssembly / interface-types

Binding to host provided String, Array, Set, Map, Error #11

Closed dcodeIO closed 5 years ago

dcodeIO commented 6 years ago

When thinking about how host bindings might become useful, a few questions come up in my mind. For example, will it be possible to ...

Context: If all of these would be possible, something like AssemblyScript could just use WASM<->JS interchangeable object handles directly for pretty much everything crossing the boundary instead of re-implementing a standard library on top of linear memory.

lukewagner commented 6 years ago

Bind to the String constructor, providing UTF8 bytes, returning an object handle? (probably)

Yes, using the STRING host binding to allow wasm to pass a (begin, length) i32 pair as a string to the String ctor.
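For comparison, here is a minimal sketch of what JS glue code has to do by hand today to get the same effect: decode UTF-8 bytes from linear memory into a JS string. The helper name stringFromMemory, the offset 16, and the standalone Memory object are assumptions for illustration and are not part of any proposal.

```javascript
// Stand-in for a module's exported linear memory (one 64 KiB page).
const memory = new WebAssembly.Memory({ initial: 1 });
const decoder = new TextDecoder("utf-8");

// Hypothetical helper: build a JS string from a (begin, length) pair of
// byte offsets into linear memory.
function stringFromMemory(begin, length) {
  const bytes = new Uint8Array(memory.buffer, begin, length);
  return decoder.decode(bytes);
}

// Simulate wasm code having written UTF-8 bytes at byte offset 16.
const encoded = new TextEncoder().encode("héllo");
new Uint8Array(memory.buffer).set(encoded, 16);

const decoded = stringFromMemory(16, encoded.length);
```

A STRING binding as described above would presumably let the engine do this conversion at the import boundary, so the module would not need the helper.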

Test two such String object handles for equality, though there is no method to bind to / without a JS helper?

I think we've discussed elsewhere having an anyref.eq (symmetric to i32.eq).

Bind to the Array constructor, providing an initial set of elements as variable arguments (0 to a lot), returning an object handle?

Yes, although, concretely, I think you'd want to import Array.from, passing a typed array iterable created by a variant of the ARRAY_BUFFER binding that let wasm create a typed array view from a (begin, length) i32 pair.
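A rough sketch of the Array.from idea using today's APIs, with JS glue creating the typed array view that the binding would otherwise create. The helper name arrayFromMemory is hypothetical, and treating length as an element count (rather than a byte count) is an assumption.

```javascript
// Stand-in for a module's exported linear memory.
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical helper: view `length` i32 elements starting at byte offset
// `begin` (which must be a multiple of 4) and copy them into a JS Array.
function arrayFromMemory(begin, length) {
  const view = new Int32Array(memory.buffer, begin, length);
  return Array.from(view);
}

// Simulate wasm code having written three i32 values at byte offset 32.
new Int32Array(memory.buffer, 32, 3).set([10, 20, 30]);

const elements = arrayFromMemory(32, 3);
```

The result is an ordinary Array that no longer aliases linear memory, so later writes by the module do not affect it.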

Bind to the Array constructor, providing an initial set of elements as variable arguments (0 to a lot), returning an object handle?

Reflect.get could be imported to perform a general property get on any object, arrays included. Reflect.get does ToString on its propertyKey argument, but if an i32 index was passed, this could be optimized.
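To make the Reflect.get suggestion concrete, the sketch below calls it the way an import would be called, with a numeric index as the property key. The import namespace env and the import name get are assumptions.

```javascript
// Hypothetical import object; a module would declare an import "env" "get".
const imports = { env: { get: Reflect.get } };

const letters = ["a", "b", "c"];

// The number 1 is converted to the property key "1" before the lookup.
const second = imports.env.get(letters, 1);

// Reflect.get only accepts objects as its target; a primitive string throws.
let primitiveRejected = false;
try {
  imports.env.get("abc", 0);
} catch (err) {
  primitiveRejected = err instanceof TypeError;
}
```

Note that the "any object" in the comment above excludes primitives: Reflect.get throws a TypeError for a primitive string target, so indexing into string values would need a different import.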

Obtain and work with an iterator object handle on a Set or Map?

Yes, although the value would be opaque and only useful to pass to imported Set/Map methods.
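As an illustration of "opaque and only useful to pass to imported methods", here is a sketch that emulates object handles with integer indices into a JS-side table, which is how glue code can approximate this without host bindings. The table and the import names set_values and iter_next are assumptions, not part of the proposal.

```javascript
// JS-side table; the integer index plays the role of an opaque handle.
const handles = [];
function toHandle(value) {
  handles.push(value);
  return handles.length - 1;
}
function fromHandle(handle) {
  return handles[handle];
}

// Hypothetical imports a module could call with integer handles.
const imports = {
  // Returns a handle to an iterator over the Set behind `setHandle`.
  set_values: (setHandle) => toHandle(fromHandle(setHandle).values()),
  // Returns a handle to the next value, or -1 when the iterator is done.
  iter_next: (iterHandle) => {
    const step = fromHandle(iterHandle).next();
    return step.done ? -1 : toHandle(step.value);
  },
};

const setHandle = toHandle(new Set(["x", "y"]));
const iterHandle = imports.set_values(setHandle);
const first = fromHandle(imports.iter_next(iterHandle));
const second = fromHandle(imports.iter_next(iterHandle));
const end = imports.iter_next(iterHandle);
```

The module never inspects the iterator itself; it only passes the handle back into another import, which matches the restriction described above.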

If all of these would be possible, something like AssemblyScript could just use WASM<->JS interchangeable object handles directly for pretty much everything crossing the boundary instead of re-implementing a standard library on top of linear memory.

There might still be speed advantages staying in linear memory, both in avoiding GC and in avoiding the indirection of calling through imports (which all of the above require). But agreed that it could be quite useful in this context.

dcodeIO commented 6 years ago

I think we've discussed elsewhere having an anyref.eq (symmetric to i32.eq).

So, in this case, anyref.eq would test the two provided strings for equality (like calling the VM's hidden string comparison implementation that one cannot directly bind to), not the reference? Asking because I'd like to avoid importing something like a stringEquals helper.

Reflect.get could be imported to perform a general property get on any object, arrays included. Reflect.get does ToString on its propertyKey argument, but if an i32 index was passed, this could be optimized.

That's an interesting idea, though it makes me worry about efficiency a bit. Maybe this and the string question above can be generalized to "binding to overloaded operators that do not have a functional counterpart" (here: equals or indexed access). If JS just had String#equals or Array#get/set that'd be a non-issue, like it is for String#concat.

There might still be speed advantages staying in linear memory, both in avoiding GC and in avoiding the indirection of calling through imports (which all of the above require)

Yeah, just curious about the "things being passed back and forth anyway" case. Guess I hoped that the penalty for binding to these things was negligible.

The reason why this is so tempting to me is that it would make it easy to produce maximally JS-compatible yet minimal modules, where only specific performance-critical parts take advantage of what WASM provides.

lukewagner commented 6 years ago

So, in this case, anyref.eq would test the two provided strings for equality (like calling the VM's hidden string comparison implementation that one cannot directly bind to), not the reference?

Oh sorry, I didn't notice this important aspect of the question. Right, so JS strings probably wouldn't be valid anyref types because they don't have reference semantics. In the other issue which brought up anyref, we discussed a more inclusive any type that could contain any JS value, including strings. But probably more discussion needed on this question.

lukewagner commented 6 years ago

That's an interesting idea, though it makes me worry about efficiency a bit.

Yeah, this wouldn't be nearly as fast as a properly-JIT-optimized property access. By having all accesses go through an import call, you basically force property access into the general megamorphic case which does a real dictionary lookup. To do better, we'd need to consider adding actual JS object operations to wasm which is definitely a separate feature from Host Bindings.

magcius commented 6 years ago

Asking because I'd like to avoid importing something like a stringEquals helper.

This would be nice, but I'd rather focus on the stuff we can do right now and expand later. I imagine host-bindings will make a good portion of the bootstrap runtime go away, but not all of it. We don't have to solve the "I want to bind every JS operator" case if we can already call out to JS through imports.

magcius commented 6 years ago

To do better, we'd need to consider adding actual JS object operations to wasm which is definitely a separate feature from Host Bindings.

There's a valid question if host-bindings should strive for efficient and maximally performant bindings, or just capable ones. The spec says "Speed - Allow JS/DOM or other host calls to be well optimized." is a goal, but does that mean that any reasonable host-bindings implementation should be able to inline such calls?

lukewagner commented 6 years ago

@magcius It's a good question, and I suggest 'no'. There's an inherent conflict between the separate-compilation model of wasm and fully optimizing calls to imports. The former allows modules' code to be cached and instantiated multiple times, cheaply. The latter requires specializing a module to a specific set of imports. Maintaining separate compilation, we can compile an import call to an indirect call in the machine code and then, at instantiation time, link that indirect call to an import-specialized stub. But to get anywhere close to matching JS get/set-property perf, we'd have to not just inline, but apply a full battery of profile-directed optimizations that involve bailouts and invalidations. So I think the expectation for "well optimized" here should be an indirect call + cheap trampoline into host C++.
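The separate-compilation model can be seen from the JS API: one compiled Module can be instantiated several times with different imports. The sketch below hand-encodes a tiny module for illustration (it imports a function "env" "f" returning an i32 and exports "run", which calls it); the names env, f, and run are assumptions.

```javascript
// Hand-encoded module: (import "env" "f" (func (result i32)))
//                      (func (export "run") (result i32) call 0)
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type: () -> i32
  0x02, 0x09, 0x01, 0x03, 0x65, 0x6e, 0x76,       // import "env"
  0x01, 0x66, 0x00, 0x00,                         //   "f", func, type 0
  0x03, 0x02, 0x01, 0x00,                         // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e,       // export "run"
  0x00, 0x01,                                     //   func index 1
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b, // body: call 0; end
]);

// Compile once; the result does not depend on any particular imports.
const compiled = new WebAssembly.Module(bytes);

// Instantiate twice, linking the same code to different imports.
const first = new WebAssembly.Instance(compiled, { env: { f: () => 1 } });
const second = new WebAssembly.Instance(compiled, { env: { f: () => 2 } });

const firstResult = first.exports.run();
const secondResult = second.exports.run();
```

Because the imports are only known at instantiation time, the compiled code for run cannot assume anything about f, which is the conflict with full inlining described above.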

dcodeIO commented 6 years ago

Object.is seems like a viable candidate for an initial equality binding. Would that work with two String object handles?
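A quick sketch of how Object.is behaves in plain JS may help frame the question. It compares primitive strings by value but String wrapper objects by identity, so the answer depends on what a "String object handle" would actually hold.

```javascript
const a = "wasm";
const b = ["wa", "sm"].join(""); // same contents, built separately

// Primitive strings compare by value.
const primitivesEqual = Object.is(a, b);

// String wrapper objects compare by identity.
const wrappersEqual = Object.is(new String("wasm"), new String("wasm"));

// Object.is differs from === only for NaN and signed zeros.
const nanEqual = Object.is(NaN, NaN);
const zerosEqual = Object.is(0, -0);
```

Here primitivesEqual and nanEqual are true, while wrappersEqual and zerosEqual are false.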

Another thought:

Reflect.get could be imported to perform a general property get on any object, arrays included. Reflect.get does ToString on its propertyKey argument, but if an i32 index was passed, this could be optimized.

Not sure if some of these things might even make good ECMAScript proposals, but if, let's say, Reflect.equals / Reflect.strictEquals were a thing (along the lines of Reflect.get / Reflect.set), these would enable all sorts of host bindings right away (with no changes to host bindings at all) and still leave the door open for future optimizations as mentioned.

From a nicer-API perspective, these would cover (hopefully) everything without using reflection:

// left == right
Object.equals(left, right)

// left != right
Object.notEquals(left, right)

// left === right
Object.strictEquals(left, right)

// left !== right
Object.notStrictEquals(left, right)

// obj[key]
Object.get(obj, key)

// obj[key] = value
Object.set(obj, key, value)

// left + right
String.concat(left, right)

// arr[index]
Array.get(arr, index)

// arr[index] = value
Array.set(arr, index, value)
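
None of the functions listed above exist in ECMAScript today. As a sketch of what they would replace, the operators can be wrapped in plain JS functions and passed to a module as imports; the helpers object and its property names below are hypothetical.

```javascript
// Hypothetical operator helpers that glue code could supply as imports.
const helpers = {
  equals: (left, right) => left == right,
  notEquals: (left, right) => left != right,
  strictEquals: (left, right) => left === right,
  notStrictEquals: (left, right) => left !== right,
  get: (obj, key) => obj[key],
  set: (obj, key, value) => { obj[key] = value; },
  concat: (left, right) => left + right,
};

const list = [1, 2, 3];
helpers.set(list, 1, 20);

const looseResult = helpers.equals(1, "1");        // loose: coerces "1" to 1
const strictResult = helpers.strictEquals(1, "1"); // strict: types differ
const updated = helpers.get(list, 1);
const joined = helpers.concat("host", "-bindings");
```

Each helper is still an import call from the module's point of view, so this gives the capability but not the optimized path discussed earlier in the thread.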

pchickey commented 5 years ago

Closing as out-of-date: these concepts don't map to the current proposal, which has evolved a lot since this issue was opened.