tlively closed 1 year ago
It is already the case that `extern.internalize` can produce arbitrary subtypes of `anyref`, because the operand may be the result of `extern.externalize` applied to a Wasm value. Moreover, nothing in the core spec can prevent a host from producing equivalent values on its own that did not originate from Wasm. Consequently, i31ref does not add anything new in that regard – when Wasm code is downcasting the result of internalize of a value that came from the host, then it is fundamentally depending on embedder-specific behaviour.
I agree that internalizing host values is inherently going to be host-dependent behavior. What matters is the internalization of externalized Wasm values. Perhaps we should specify that `internalize(externalize(val)) == val`?
@titzer, that will follow from the semantics. (Both operations are merely representation changes between two isomorphic types; semantically they are no-ops. Consequently, both internalize ∘ externalize and externalize ∘ internalize are the identity.)
Hmm, I don't quite see how the Wasm specification will achieve that if it specifies neither the representation of host values nor the mapping of Wasm values to host values. Probably splitting hairs here, but I don't see how the specification can say anything more precise than that internalize ∘ externalize is the identity, since it seems host value equivalence can't be specified by Wasm either.
Good point that we already get this situation from internalizing externalized values. It still seems possible for there to exist a compatibility hazard, but the solution would be to standardize the internalized types of values above the layer of the Wasm core spec, just like the JS embedding spec does. For example, if a WASI API returns an `externref`, it should probably also specify the precise internalized type of that externref as well. That doesn't play well with virtualization, though, so it would actually have to specify only an upper bound on the internalized type, and then the compat hazard becomes possible again.
@titzer, as far as the core spec is concerned, the universe of references under type `anyref` is extended with an internal `ref.extern a`, where `a` is some abstract host address. So yes, it does not know anything about proper host references.
But the universe of references under type `externref` will be exactly the same. That is, it also includes abstract proper host references, as well as externalised Wasm references, which are distinguished and not abstract. That way, we can trivially specify the bijection between both types that is observed by in/externalize (which we need to be able to specify).
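To make the bijection described above concrete, here is a toy Python model. It is only a sketch: the class names (`I31`, `StructRef`, `RefExtern`, `HostRef`, `ExternalizedWasm`) and the use of integers as abstract host addresses are illustrative assumptions, not spec notation beyond `ref.extern a`.

```python
from dataclasses import dataclass
from typing import Union

# --- Internal side: values that live under anyref (hypothetical names) ---

@dataclass(frozen=True)
class I31:
    value: int

@dataclass(frozen=True)
class StructRef:
    addr: int  # stands in for a Wasm struct allocation

@dataclass(frozen=True)
class RefExtern:
    host_addr: int  # the abstract host address `a` in `ref.extern a`

# --- External side: values that live under externref (hypothetical names) ---

@dataclass(frozen=True)
class HostRef:
    host_addr: int  # an abstract proper host reference

@dataclass(frozen=True)
class ExternalizedWasm:
    inner: Union[I31, StructRef]  # distinguished, not abstract

def externalize(ref):
    """anyref -> externref: unwrap host addresses, wrap Wasm references."""
    if isinstance(ref, RefExtern):
        return HostRef(ref.host_addr)
    return ExternalizedWasm(ref)

def internalize(ref):
    """externref -> anyref: the inverse of externalize."""
    if isinstance(ref, ExternalizedWasm):
        return ref.inner
    return RefExtern(ref.host_addr)
```

In this model both compositions are the identity by construction, and a Wasm value such as `I31(7)` keeps its dynamic type across a round trip. What the model cannot say is which internal value a given host object maps to; that is the part left to each embedding.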
The rest then is up to each API spec, which has to define which sort of Wasm reference its own values are mapped to at the boundary. For JS, some JS values are mapped to Wasm references (small ints, Wasm exotic objects, both of which Wasm can observe) while most others are mapped to proper extern references.
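As a rough illustration of such a boundary mapping, the following Python sketch is loosely modelled on the JS case just described. The function name `internalize_js_value`, the wrapper classes, and the use of Python values as stand-ins for JS values are all assumptions made for the example; it is not the JS API algorithm itself.

```python
from dataclasses import dataclass

# Signed 31-bit range representable by i31ref.
I31_MIN, I31_MAX = -(2**30), 2**30 - 1

@dataclass(frozen=True)
class I31:
    value: int

@dataclass(frozen=True)
class WasmObject:
    addr: int  # stands in for a Wasm struct or array

@dataclass(frozen=True)
class ExoticWrapper:
    wasm_ref: WasmObject  # stands in for a JS exotic object wrapping a Wasm reference

@dataclass(frozen=True)
class RefExtern:
    host_value: object  # a proper host reference, opaque to Wasm

def internalize_js_value(v):
    """Map a host value to the Wasm reference it appears as at the boundary."""
    if isinstance(v, ExoticWrapper):
        return v.wasm_ref  # Wasm can observe the underlying object again
    if isinstance(v, bool):
        return RefExtern(v)  # bool is an int subclass in Python; keep it opaque
    if isinstance(v, int):
        small = I31_MIN <= v <= I31_MAX
    elif isinstance(v, float):
        small = v.is_integer() and I31_MIN <= v <= I31_MAX
    else:
        small = False
    return I31(int(v)) if small else RefExtern(v)
```

Under these assumptions, `internalize_js_value(5)` yields an `I31`, while `internalize_js_value(2**30)` and `internalize_js_value("text")` stay opaque `RefExtern` values. A different embedding could draw these lines elsewhere, which is exactly the embedder-specific part.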
Is there any more to say about the potential compat hazard here, or can this be closed?
Closing, feel free to reopen for further discussion.
In the JS embedding, `extern.internalize` returns values dynamically typed as `anyref` or `i31ref`. This suggests that other embedders would be free to return values with any other subtype of `anyref` as well. For example, an embedder might choose to internalize values into concrete struct or array types rather than opaque values. Portable modules must not assume internalized references are anything more specific than `anyref`, although it would be possible to write non-portable modules that do make assumptions, for example by always casting the result of `extern.internalize` to `i31ref` or some other type.

Are we concerned about different embeddings having different `extern.internalize` behavior being a compatibility hazard?
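The hazard can be sketched with a small Python simulation. Everything here is hypothetical: the two embedder functions, the `Trap` exception, and the "module" functions merely stand in for the behaviour described above, with a failed cast modelled as a raised exception.

```python
from dataclasses import dataclass

class Trap(Exception):
    """Stands in for a Wasm trap caused by a failed cast."""

@dataclass(frozen=True)
class I31:
    value: int

@dataclass(frozen=True)
class OpaqueHostRef:
    payload: object

def js_like_internalize(host_value):
    # Assumed embedder A: small integers surface as i31ref.
    if isinstance(host_value, int) and -(2**30) <= host_value < 2**30:
        return I31(host_value)
    return OpaqueHostRef(host_value)

def opaque_internalize(host_value):
    # Assumed embedder B: every host value stays opaque.
    return OpaqueHostRef(host_value)

def cast_to_i31(ref):
    if not isinstance(ref, I31):
        raise Trap("cast failure")
    return ref

def non_portable_get(internalize, host_value):
    # Always casts the internalized value to i31ref, as in the example above.
    return cast_to_i31(internalize(host_value)).value

def portable_get(internalize, host_value):
    # Assumes only anyref and tests the dynamic type before using it.
    ref = internalize(host_value)
    return ref.value if isinstance(ref, I31) else None
```

With these definitions, `non_portable_get(js_like_internalize, 5)` succeeds, but the same call with `opaque_internalize` raises `Trap`. `portable_get` works under both embedders, at the cost of handling the case where the value is not an `I31`.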