Open wingo opened 2 years ago
the current spec has this property right now and we should preserve it
I may misunderstand what exact property you mean. But the spec does in no way guarantee (or intend to guarantee) that every value under anyref can be classified. For starters, you can't recognise host references. There is nothing fundamentally wrong with introducing Wasm references that you cannot classify dynamically either, yet they are anyref. In fact, that's highly preferable to introducing additional top types, because each of those would form its own hierarchy and would have to come with a new bottom type as well.
My imprecision stems from my ignorance :) Backing up a bit, I think you would want stringref values to be classifiable (if I understand your term correctly). But you also might want stringview_wtf16 to have the same representation as strings, in a browser. In my mind the solution here is that a stringref can be held in an anyref-typed location but that a stringview_wtf16 cannot. Therefore if you see a v8::String value, you know it's a stringref and not a stringview_wtf16. Is this a reasonable thing to want, @rossberg?
Okay, I see, that's a problem. Technically, I agree your suggestion would be a solution. But one that adds significant complexity to the type system. Can't say that I'd get excited about that.
An alternative would be not to treat views as reference types but as a new, third category of value type that's neither numeric nor reference. Not sure that's simpler, though, it's probably even worse.
Personally, I would rather avoid them sharing a representation. Not least because that results in an implementation-dependent cost model. But I fear that's the case for views already?
To a degree, I think the question of implementation-dependent cost models is just a thing we have to deal with, for better or for worse. In an implementation using encoding X internally, it will be cheaper (and indeed possibly free) to obtain a view on a string's contents for encoding X than encoding Y. I think it's a fundamental aspect of this particular local maximum in the design space.
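A rough JS sketch of the cost-model point above, for illustration only. It assumes a host whose strings are WTF-16 internally (as JS strings are); `asWtf16View` and `asUtf8View` are hypothetical helper names, not instructions from the proposal, and `TextEncoder` produces UTF-8, which only approximates WTF-8.

```javascript
// Sketch only: JS strings stand in for an engine whose internal encoding is WTF-16.
// asWtf16View and asUtf8View are hypothetical helpers, not part of any proposal.

// Same encoding as the host string: the "view" is the string itself, so it is free.
function asWtf16View(str) {
  return str;
}

// Different encoding: the contents must be transcoded, costing O(n) time and memory.
// TextEncoder emits UTF-8 (lone surrogates become U+FFFD), so this approximates WTF-8.
function asUtf8View(str) {
  return new TextEncoder().encode(str);
}

const s = "héllo";
const view16 = asWtf16View(s);
const view8 = asUtf8View(s);

console.log(view16 === s); // whether the view is the very same value as the string
console.log(view8.length); // byte length of the freshly allocated transcoded copy
```

On such a host the WTF-16 view cannot be distinguished from the string it came from, which is the representation-sharing concern raised earlier in the thread.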
I am sympathetic to the type system complexity question, of course.
Just per J2WASM experience, stringref being a subtype of anyref doesn't really help us, since in our type system we cannot use any as the top type; the top type needs to have properties like toString, equals and hashCode. As a result we have wrapper types for things like Strings and Arrays.
This is in contrast to our modeling with J2CL, where we backed things with JS types without wrappers where applicable. This was critical for our jsinterop story, which was the main driver for the compiler. It resulted in trampolines on these top-level methods to handle various mapped JS builtins; however, that is unlikely to be something we can adapt in the J2WASM case.
Some experimentation with V8 shows that in the current implementation, stringview types are their own top types rather than being subtypes of any, but that they are also all supertypes of none, so they're not quite in their own hierarchy.
For the time being I've changed Binaryen's implementation to match this (https://github.com/WebAssembly/binaryen/pull/6440), but if this proposal ever gets revived, it would be good to properly separate the stringviews from the any hierarchy by giving them their own bottom types.
The current V8 implementation also disallows casts to stringview types, but this is inconsistent with the final WasmGC spec, where any reference type can be the target of a cast. It would be better to be consistent and allow casts to stringview types; if they're properly separated into their own type hierarchies, they would still be implementable as unmodified string.
I'm discovering now that it seems there is no way to store a stringref in an anyref. If true, for our work-in-progress implementation of Scala-to-Wasm, that would be a total blocker. It doesn't have to be a proper subtype, but at least it should have O(1) conversion operations like any.convert_extern and extern.convert_any. And for us to be able to use stringref at all, we would need such values to be seen by the JS embedding as JS strings (like i31refs are guaranteed by spec to be seen as JS numbers in the appropriate range, and vice versa).
We indeed have a universal representation of types. Unlike what was said about J2WASM above, in Scala-to-Wasm we do not compromise on our JS interop story, even when compiling to Wasm. That means our universal representation must be able to store values in a way that, when crossing the JS embedding, maps to the corresponding JS types. The externref/anyref equivalence guaranteed by the JS embedding for GC is a critical property for us. Can we get something similar for stringref/anyref?
(We are not yet attempting to use the stringref proposal; currently we use actual JS strings, but I very much hope to be able to use stringref in the future.)
@sjrd, in case you didn't know, this proposal is essentially on hold in favor of JS string builtins, so you should focus on that proposal instead.
Oh thanks for the info. I'm also following the JS string builtins proposal, but I didn't know that it was likely to supersede stringref.
FWIW the workaround for stringref <-> anyref, such as it is, is to allocate a (struct (ref string)) wrapper. Not ideal!
That would not work for us, because when that (struct (ref string)) is given to JS through the JS embedding, JS will see an opaque Wasm object, rather than a JS string.
That's why, for example, we do not wrap our f64s into (struct (f64)). Instead we go through a JS function
function boxDouble(x) {
  return x;
}
that we import in Wasm as a f64 -> anyref. This way, we can put in our universal anyref representation something that, if given to JS, is actually a number.
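The identity-import trick described above can be tried end to end with a minimal sketch like the following. The module bytes are hand-assembled purely for illustration and are not Scala-to-Wasm output. The sketch also types the import as f64 -> externref rather than f64 -> anyref, an assumption made so that it runs on engines with reference types but without WasmGC; the import module name `env` and the export name `box` are likewise made up.

```javascript
// Hand-assembled module (an illustrative assumption, not real compiler output):
//   (import "env" "boxDouble" (func (param f64) (result externref)))
//   (func (export "box") (param f64) (result externref)
//     local.get 0
//     call 0)
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7c, 0x01, 0x6f,             // type: (f64) -> externref
  0x02, 0x11, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import section, module "env"
  0x09, 0x62, 0x6f, 0x78, 0x44, 0x6f, 0x75, 0x62, 0x6c, 0x65, // field "boxDouble"
  0x00, 0x00,                                                 // kind func, type index 0
  0x03, 0x02, 0x01, 0x00,                                     // one defined function, type 0
  0x07, 0x07, 0x01, 0x03, 0x62, 0x6f, 0x78, 0x00, 0x01,       // export "box" = func index 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x20, 0x00, 0x10, 0x00, 0x0b, // body: local.get 0; call 0; end
]);

// The JS side is just the identity function: the number itself becomes the reference.
function boxDouble(x) {
  return x;
}

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes), {
  env: { boxDouble },
});

// The f64 goes into Wasm, through the import, and comes back as a reference-typed value.
const boxed = instance.exports.box(1.5);
console.log(typeof boxed, boxed);
```

Because the reference handed back to JS is the original JS value rather than a Wasm struct, JS code sees an ordinary number, which is the property a (struct (f64)) wrapper would lose.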
The current V8 implementation assumes that stringref is a subtype of anyref.
(It seems to have been introduced in this PR.)
Right, but not the stringview* types.
For languages with a universal value representation, it would be nice to be able to have a list of "any", then pull out the individual values and do some type dispatch on those values. Strings are a fundamental kind of value, so they should support this idiom. The whole type hierarchy is in flux over at the GC proposal but at the minimum we should support the ref.as_string, br_on_string, ref.is_string set of instructions, in whatever form those instructions end up landing (https://github.com/WebAssembly/gc/issues/274).
On the other hand we really really want to avoid having to do this for stringview_wtf8, stringview_wtf16, and stringview_iter. We expect that on a run-time implementation that represents strings as WTF-8, that a stringview_wtf8 will just be the string itself, and likewise for WTF-16 systems and stringview_wtf16. You wouldn't be able to dynamically dispatch on the view to know its type, because stringref shares a representation. Not sure exactly how to make this happen on the spec level but the current spec has this property right now and we should preserve it :)
Related to #3.
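As a loose JS analogy for the dispatch idiom described in this post (not Wasm, and not the proposal's instructions): `classify` below is a hypothetical helper in which `typeof` plays the role that ref.is_string or br_on_string would play over anyref values. The `view` variable is an assumed stand-in for a stringview_wtf16 on an engine that stores strings as WTF-16.

```javascript
// classify is a hypothetical helper; typeof stands in for ref.is_string / br_on_string.
function classify(value) {
  if (typeof value === "string") return "string";
  if (typeof value === "number") return "number";
  if (value === null || value === undefined) return "null";
  return "other";
}

// A heterogeneous list of "any" values, dispatched on one at a time.
const anys = ["abc", 42, null, { x: 1 }];
const kinds = anys.map(classify);
console.log(kinds);

// Stand-in for a view that shares the string's representation: at run time it is
// just the string, so dynamic dispatch can only ever report "string" for it.
const view = "abc";
console.log(classify(view));
```

This is why the post wants strings to be classifiable under anyref while keeping the view types out of that hierarchy: a check on a shared representation has no way to answer "view or string?".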