taralx commented 4 years ago

This is a continuation of #94 with a more fleshed-out proposal. Many thanks to @RossTate for helping me understand the underlying issues and honestly doing most of the drafting. 😄

Garbage Collection with Two Types

This is a description of an MVP for GC that would require just two types: structref and tagref. This is not meant to be the end-all-be-all for GC; it is specifically designed to be an MVP. In particular, it is kept very simple both to make it easy to implement and to make it easy to translate into a more advanced version-2 GC proposal.

Structure References

A structref is a reference to a mutable data structure that is simply an array of reference data and an array of scalar (i.e. linear-memory) data. The expectation is that a structref is implemented as a butterfly, i.e. a pointer to the "center" of a data structure, with one of the two arrays stored at negative offsets from that center and the other stored at positive offsets. How large these arrays can be and how scalar data is specifically represented are details that are worth discussion at some point, but which we believe should be deferred at the moment. As a starting point, we suggest scalar data be represented as an array of i32 fields, and that both reference and scalar arrays each have a maximum size of 2³⁰ (i.e. maximum index of 2³⁰-1). (Alternatively, the scalar data could be accessed with instructions akin to linear memory; we use i32 "fields" here because it makes the examples simpler.)

In order to avoid every access involving an array-bounds check, we include the following instructions:

structref.assert_bounds : [structref i32 i32] -> []
- traps if the given structref has strictly fewer than the first number of reference fields or strictly fewer than the second number of scalar fields
structref.br_unless_bounds $label : [structref i32 i32] -> []
- branches to label $label : [] iff the given structref has strictly fewer than the first number of reference fields or strictly fewer than the second number of scalar fields

The expectation is that, after these instructions, even a streaming compiler can eliminate the bounds checks for accesses to appropriate fields. However, because this proposal is specifically an MVP, and because in the worst case superfluous bounds checks can be performed relatively quickly, we do not propose incorporating these bounds directly into type information. That is, structref is a type with no parameters.

The remaining structref instructions are the following:

structref.new : [i32 i32] -> [structref]
- allocates a structref with the first number of reference fields and the second number of scalar fields
- traps if either of those numbers are too large
structref.get_bounds : [structref] -> [i32 i32]
- the first number is the number of reference fields and the second number is the number of scalar fields
structref.get_i32 : [structref i32] -> [i32]
structref.set_i32 : [i32 structref i32] -> []
structref.get_tagref : [structref i32] -> [tagref]
structref.set_tagref : [tagref structref i32] -> []
structref.eq : [structref structref] -> [i32]

Note that the last instruction means that structure references have a notion of identity, which makes sense given that they are fundamentally mutable data structures.

Tagged References

Note that the reference fields of a structref have type tagref. A tagref is a tagged reference, meaning it has some tag and a payload corresponding to that tag. For this, we repurpose events from the exception-handling proposal.

The tagref instructions are the following:

tagref.pack $tag : [t*] -> [tagref]
- where event $tag : [t*]
tagref.unpack $tag : [tagref] -> [t*]
- traps if the tagref was not packed with $tag
tagref.test $tag : [tagref] -> [i32]
tagref.br_unless_unpack $tag $label : [tagref] -> [t*]
- where event $tag : [t*]

Note that there is no way to mutate the contents of a tagged reference or to compare them for equality. This enables a number of optimizations, and in particular means that we can use canonical tags instead of coercion instructions to and from each other primitive type.

Canonical Tags

We propose that some types should have canonical tags. Although it is worth discussing which types specifically should have a canonical tag, we believe this discussion should be deferred, and so as a starting point we propose the following types: structref and funcref. We elide a canonical tag for i32 for now to avoid discussions such as for i31ref.

The reason to have canonical tags is to enable specialized representations for canonical cases. This avoids the need to always box and unbox particularly common reference values in order to put them into the reference fields of structure references. It is even possible put canonical-tag information directly into the pointer so that no memory accesses are necessary to unpack these canonical cases.

Note that the call tags proposal (briefly known as dispatch tags) makes casting funcref to a more precise type unnecessary.

Imports and Exports

Although not strictly necessary, this proposal is positioned to benefit significantly from compile-time imports and exports. For example, a module for some Java class could request a compile-time import for the number of reference and scalar fields its superclass defined in some other module has, and then use that to compute the offsets at which its own fields should be located. By having this import at compile-time, these offsets would be constant expressions and easily folded, both for assert_bounds instructions and for field-access instructions, thereby eliminating many bounds checks that would be difficult for any simple static type system to eliminate.

This proposal also has no need to restrict type imports and exports to just reference types (and no need for subtyping in general). One can simply import/export an arbitrary type along with a preferred tag to use for coercing it to and from tagref. If a module wants its exported type to be unexaminable, then the exported tag can be one created by the module and only exported with an abstract signature, or it can choose not to export a tag at all. If a module does not care about abstraction, then the exported tag can be the canonical one for the exported type. Either way, this proposal guarantees type abstraction by default.

Examples

Java/C#/Kotlin/Scala Object Layout

An object is represented as a structref. Its first scalar field is the numerical identity of the object (unless we choose to add numerical identity directly to structref in general). Its first reference field is the runtime info for the object's class (more on that below). The remaining fields is the fields of the class. (For C#, an object's runtime instantiation's of a class's type parameters is included in these fields.) Due to the butterfly representation, subclasses can add more reference fields and more scalar fields without issue. If the object is an array, then the array elements can simply be stored directly within the object. All reference field values is tagged with the canonical tag for structref.

A class runtime is also represented as a `structref` with various "sections". Some sections can be statically guaranteed to be at a particular offset, whereas some sections have a scalar field (at a statically known offset) that indicates where the section begins in the `structref`:	Reference Fields	Scalar Fields
interface-method table (19 funcrefs using `func_switch`) (relevant paper for background)	index of superclass list
virtual-method table (funcrefs)	index of implemented-interface list
superclass list (structrefs)	index of reflection information
implemented-interface list (structrefs)	(Scala) index of inherited-trait list (C#) index of type-argument list
Reflection information (structrefs)
(Scala) inherited-trait list (structrefs) (C#) type-argument list (structrefs)

Note also that this representation illustrates how this proposal supports compacting multiple variable-sized components into one memory structure, as is often done in runtimes.

In #102, @skuzmich raises the issue of supporting separate compilation to WebAssembly. For abstracting the size of superclasses and the location(s) of fields/methods, a module could import and export the sizes of various non-private classes as well as the offsets of various non-private fields, ideally at compile-time. For concealing information from other modules of the same Java/C#/Kotlin/Scala program, one could use custom tags for private fields. For concealing information from modules outside of the Java/C#/Kotlin/Scala program, one could use a custom tag (shared across all the modules within the program) for sealing objects in order to protect data from untrusted code. At the same time, one could specify prototypes in JavaScript that make these sealed values look like JavaScript values with the relevant methods.

OCaml

In #100, there was discussion of how to implement OCaml in the various GC proposals, and in this comment @sabine laid out roughly the following strategy using (the precursor to) this proposal (where structref m n indicates m reference fields and n scalar i32 fields):

(event $boxed32 (param i32))
(event $boxed64 (param i64))

value: tagref (unboxed immediate or pointer to a block)
  32-bit immediate: $boxed32
  64-bit immediate: $boxed64
  block pointer: canonical tag for structref
    tuple<n> block:     (structref n 1)
    closure block:      (structref (1+m) (1+n)    (one ref for funcref, one i32 for tag, the rest for environment)
    array block:        (structref len 1)
    32bit-array block: (structref 0 (1+len))
    64bit-array block: (structref 0 (1+2*len))
    string block:       (structref 0 (1+(len+3)/4))

There are a few changes to @sabine's design in this example:

Use of custom tags to encode immediates
Closures use the dynamic arity technique that the call tags proposal supports (which enables closures to capture unboxed scalar values in the environment)
Consolidation of tuple<n> and record<n> blocks, leaving the tagging information in the scalar field to distinguish the two when necessary
An array for 32-bit scalars, akin to the preexisting one for 64-bit scalars
Use of i32 fields rather than bytes (a temporary detail)

A key property of this representation is that it supports efficient polymorphic structural operations. Checking whether a value is a structref is quick, as is checking whether that structref has at least one scalar field. That scalar field then provides a tagging information (at the OCaml level, rather than the WebAssembly level), which informs the structural comparison how to proceed. Typically this will involve comparing the scalar fields or recursively comparing the reference fields, which this proposal makes easy to iterate through.

Interior Pointers, Nested Structures, and Array Slices

In #59, @aykevl discusses Go's need for interior pointers, and in #77, @vargaz discusses .NET's need for interior pointers. Although this proposal does not support true interior pointers (leaving that for some future extension), it does support fat interior pointers, which in #59 @aykevl says would be sufficient for now. For example, an int32* in Go could be represented by a pair [structref, i32] indicating a structref and a specific offset within the scalar fields for the referenced int32. Note that this even supports pointers to just past the end of array, which @vargaz says is important in #77.

Both Go and C# provide nested structures. This proposal supports nested structures by splitting their reference and scalar fields across the two wings of the structref butterfly. In turn, an interior pointer to a nested structure comprosed of both reference and scalar data in Go/C# would be represented by a triple [structref, i32, i32], with one i32 indicating the start of the reference components of the structure within the structref, and the other indicating the start of the scalar components.

Similarly, Go has array slices, which can be represented by a structref paired with multiple i32 indicating the start and end/length of the array slice's reference and/or scalar components.

Looking Ahead

Due to its simplicity, this proposal is easy to embed in (slight extensions to) either pre-existing proposal. It does not commit us to either path. We hope that it will provide a simple and flexible target for many languages, and after building up a community and corpus we can do analyses to inform descriptive type systems capturing the usage patterns in the wild. A subsequent version of Garbage Collection can be developed to optimize for these patterns, providing more compactness, better performance, and improved composability and abstraction for real systems.

aardappel commented 4 years ago

I'd say you definitely need to make the linear memory portion byte-addressed, as accessing i64 and f64 on an array of i32 is going to weird, and we might as well have the option to store i16 and i8 quantities directly.

It is not clearly explained why we need tags at this level of generality. Certainly, most structref don't need a tag, as this is just a singular type. If a structref needs RTTI of some sort, the language can choose to stick an id in the linear memory portion, or a vtable in the ref portion themselves. So I presume the primary use case is tagged scalars? Maybe they should be supported more explicitly, since an unbounded amount of tags is going to require a tagref to take up 64-bits or so in wasm32 presumably?

Having the assert_bounds is a nice stopgap measure to reduce the cost of bounds checking, but I would recommend getting these to be part of the type system as early as possible, maybe part of the MVP. This need not be complex, since a lazy producer can use a single structref<0,0> type for everything, and get the same result as not having types (each access will need a boundcheck)`, but types will allow actual bounds on e.g. function arguments, making sure the validator can check the bound were already checked previously.

kmiller68 commented 4 years ago

It is not clearly explained why we need tags at this level of generality. Certainly, most structref don't need a tag, as this is just a singular type. If a structref needs RTTI of some sort, the language can choose to stick an id in the linear memory portion, or a vtable in the ref portion themselves. So I presume the primary use case is tagged scalars? Maybe they should be supported more explicitly, since an unbounded amount of tags is going to require a tagref to take up 64-bits or so in wasm32 presumably?

I think the tag is only for downcasting from tagref to funcref/structref but perhaps I misread. IIUC, there are only two tags which are effectively functag and structtag. My understanding is that using a scalar in the structrefs "linear memory" and calling a producer-provided downcasting WASM function is the expected implementation in the MVP version of this proposal, as you said.

fgmccabe commented 4 years ago

Re Interior pointers. The fatness of the point is more than advertised:

There needs to be both reference and scalar aspects
There needs to be an upper and a lower bound in both dimensions. Otherwise, one could escape the implied bounds of the interior pointer. In order to compile with some semblance of code size, one would have to use interior pointers everywhere in compiled code.

kripken commented 4 years ago

@taralx Very interesting proposal!

The expectation is that, after these instructions [assert_bounds, br_unless_bounds], even a streaming compiler can eliminate the bounds checks for accesses to appropriate fields.

I think these instructions are a great idea, but for optimizing compilers actually! (for streaming ones, the small overhead of bounds checks is probably not a big deal I suspect, given the other overhead) Specifically, they can help in two areas that have come up in discussions of other topics. First,

read 1
read 2
read 3

As this code stands, the VM must bounds check each line because if a trap occurs it must happen on the proper line. However, if we do

assert ok to read 1, 2, 3
read 1
read 2
read 3

then the VM can check once up front, trap if it needs to there, and then not worry about it.

Or maybe the compiler can see such a sequence and optimistically check up front that 3 is ok, and if not, only then iterate? But identifying such sequences is not trivial if there are ifs along the way,

read 1
if A read 2
if B read 3

This makes it even harder to guess. Instead, an explicit assertion solves the issue.

taralx commented 4 years ago

@aardappel:

I'd say you definitely need to make the linear memory portion byte-addressed, as accessing i64 and f64 on an array of i32 is going to weird, and we might as well have the option to store i16 and i8 quantities directly.

Agreed that we need almost certainly need something more than "array of i32". Possibly a full parallel set to (or variant encoding of) the existing linear memory load/store operations.

It is not clearly explained why we need tags at this level of generality.

As mentioned by others, tagref is a sum type (a.k.a. discriminated union). We define two canonical tags (funcref and structref) here, but we expect that we will want to add additional canonical tags before release. In addition, the user can define custom tags. The methods of implementation are varied; some implementations may have to box the tag and value, others may be able to pack them together. It is expected that canonical tags get better treatment, e.g. tagref.pack funcref being a no-op.

I would recommend getting [bounds] to be part of the type system as early as possible.

If this were going to be the final form of GC, we would absolutely do that. But this is an MVP, and adding bounds to the type risks severely complicating the type checking. So in the spirit of minimality, I recommend keeping them out. As you noted, they can be relatively easily added in later.

taralx commented 4 years ago

@fgmccabe:

There needs to be both reference and scalar aspects

I'm not clear on where this would be necessary for typed interior pointers? A fat pointer to a reference type would use an index into the reference array instead of an index into the scalar array.

There needs to be an upper and a lower bound in both dimensions

Which language are you thinking of here? Go doesn't allow offsetting of interior pointers (slices are a different matter, and they are already fat in the way you describe). C# does, but only in unsafe code where it is expected that the code can break the bounds of the interior pointer.

taralx commented 4 years ago

@kripken Agreed. We mentioned streaming compilers because we expect that optimizing compilers will often find that the region between accesses is sufficiently side-effect free and that the bounds checks can therefore be moved up. Streaming compilers would be much less likely to be able to do this.

kripken commented 4 years ago

@taralx The accesses themselves have the potential side effect of trapping, though? For that reason I don't think an optimizing compiler can optimize things like load1; load2;. The only things it could are load1; load1; (same check) or load2; load1; (first check includes the second). So I think your proposed instructions are very important.

RossTate commented 4 years ago

@kripken Yes, you have the motivation for assert_bounds correct.

@aardappel @kmiller68 The proposal does not prescribe an implementation of tags. That said, what we had in mind was that the canonical tags would typically be embedded directly in the pointer, whereas non-canonical tags would be in the heap. Something I'm realizing isn't clear in the write up is that a tag can in general be any event (as in exception events from the the exception-handling proposal), so applications can specify their own tags in addition to the canonical ones. We use custom tags in the OCaml example and in the plan for tieing in with abstract imported types (without requiring those imported types to be reference types).

(As a meta note, the name "event" may not be the best. It's tied to a specific application, whereas the application is more general. Something like "data tag", complementary to "call tag", might be a better fit. But that issue is clearly orthogonal to this proposal besides the confusing terminology "event" creates here.)

titzer commented 4 years ago

This is effectively a dynamically-typed object system. I do not see how this can be made performance-competitive with, e.g. a typical implementation of Java's object model in any modern JVM. Reading a (tagged) reference field will yield an untyped tagged reference that needs both a tag check (presumably to crash if not an object) plus further bounds checking. That is prohibitive.

The motivation for a dynamically-typed object system also seems unclear. WebAssembly engines for the web are already embedded in JavaScript virtual machines that will do an even better job (through complex dynamic optimization) than what is proposed here, and the model is more flexible. What value add is there of a dynamically-typed object system in that case?

RossTate commented 4 years ago

@titzer You seem to be under the impression that the current MVP is not dynamically typed. That's because on the surface it appears to be, but once you consider use cases in more depth you run into problems that effectively force you to use dynamic typing. That is why I have been asking for detailed case studies showing how the MVP would implement complete sets of interacting features.

For example, because the current MVP has no plan to address the space-abstraction problem I presented at the in-person meeting, in #102 @skuzmich has explained that he will be representing Kotlin objects as arrays of anyrefs, where the first element is a reference to a vtable (that itself is an array of anyrefs), and the subsequent elements are references to structures of each class's non-inherited fields. So a field access involves an bounds-checked load followed by a dynamic cast (which itself involves a bounds-checked load) and then typed load. A method access involves a bounds-checked load, and then a dynamic cast (to get the vtable), and then a bounds-checked load followed by a dynamic cast (to get the declaring class's method table), and then a typed load (and after all that, the callee will still only know that the "this" pointer is an array of anyrefs). And all that indirection still doesn't address changes to which superclass declares a field or method.

On the other hand, in this proposal a field access satisfying the needs in #102 is just a bounds-checked load possibly followed by a canonical tag check, which at worst is a load and at best is a bit-flag check in the pointer.

Similar problems happen for pretty much every other language for some reason or another. In C#, an int[] has to be represented as a reference to an array of anyref that in tern have to be dynamically cast to get the contained int32 value. In Go and C#, an interior pointer has to be represented as a reference to an anyref and a function reference that knows how to get and update the relevant value within that anyref. In OCaml, every value has to be an anyref in order to take advantage of i31ref, and even without that every value can at best be a reference to an integer field that then informs what to further dynamically cast the value to. (I can dig up the links to the relevant comments some other time.)

Note that all these cases use frequent dynamic casts, and in the current MVP those dynamic casts are implemented using a load of an array and then a bounds-checked load followed by an equality check, i.e. a bounds-checked double indirection. Among casting mechanism, this is one of the slower ones, especially considering how frequently it will have to be used. And we can try to rely on (inline) caching techniques to make up for these shortcomings, but my understanding is that WebAssembly is not supposed to rely on speculative optimization for its performance.

To summarize, the current MVP (and likely any MVP) is not expressive enough to avoid extremely frequent casting. Our proposal embraces that inevitability and provides flexible representation with efficient casts. The current MVP, on the other hand, fails to recognize the limitations of its type system, forcing people to jump through hoops to fit their needs within it, and is designed around a slow casting mechanism that makes these limitations all the more painful.

RossTate commented 4 years ago

Also, while the simplicity of the proposal here leaves many options open for extending it with type information to eliminate various (efficient) casts, the complexity of the current MVP makes it unclear that it is possible to do the same to eliminate various (inefficient) casts. For example, there was a plan for extending the current MVP with existential types to eliminate some superfluous casts. But that was based on 20-year-old research that had been abandoned by the relevant research community when no one was able to figure out how to make it scale past toy languages and systems. It was unable to express common features (like typed arrays), too hard to generate (a PhD per compiler), and too bulky (type-annotation size was as large as code size). In fact, the entire long line of research on using structural types in low-level systems to describe the heap (as the current MVP does) was abandoned because its advocates were never able to overcome these challenges.

To summarize, whereas the proposal here is easy to extend, there is good reason to believe the current MVP is not extensible in a significantly effective manner.

conrad-watt commented 4 years ago

For example, the last time we had tried having a meeting to critically discuss the current MVP, we were told to wait because we would be seeing a presentation at the in-person meeting on a plan for extending the current MVP with existential types to eliminate some superfluous casts. But that never happened. I suspect the reason is that it was pointed out that that plan was based on 20-year-old research that had abandoned by the relevant research community when no one was able to figure out how to make it scale past toy languages and systems.

I don't think this is fair. IIRC, the in-person GC proposal timeslot ended up being taken up entirely by the fundamental nominal vs structural debate which had been raised by the SOIL proposal - there's no need to imply a deeper motive for why other discussions didn't take place. FWIW I think using the time this way was the right call.

RossTate commented 4 years ago

The night before the meeting, I was sent a slide deck titled "Casts for Low-level GC" on existential types to be presented the next day. I then gave the background information on the known weaknesses of the underlying theory of the plan. Then the slides were never presented. Though admittedly I was venting some frustration over this, I did not mean to imply a deeper motive, and I'm not entirely sure what seems to be implied. But apparently there is an unintended implication, so thanks for pointing it out; I have revised my comment to remove it.

titzer commented 4 years ago

There is an undercurrent to the discussion that the current MVP is inadequate to describe object models akin to Java with the same level of efficiency as modern JVMs. We need to clarify the bounds of that inadequacy in order to point us in the proper direction. In particular,

In the current MVP, the virtual dispatch mechanism for single-inheritance object systems (like Java invokevirtual) requires a dynamic cast, barring existential types
The interface dispatch mechanism for multiple (non-implementation) inheritance (C#, Java, Kotlin interfaces) has not been sufficiently studied in the current MVP
According to some separate compilation models, field inheritance for single-inheritance object systems across module boundaries requires indirection or additional import mechanisms, e.g. to import field offsets or struct sizes

@RossTate what you seem to be saying is that the dynamic check for virtual dispatch here is a kind of dynamic typing, and I agree. But I don't agree that by making the object system even more dynamic, we are solving any real problems. In fact, this is going the wrong direction, because it is even less efficient at its best, including the simple case of just having unboxed primitive fields in structures and structures of fixed size. It is performance prohibitive because it will be significantly slower and more heavyweight than a typical JVM's object model.

I raised an important point earlier but maybe it was lost in the noise. There is already an efficient dynamically-typed object model as a target for web compilation: JavaScript. Without a value-add over that object system in any way, a dynamic object system in Wasm is just adding more engine complexity without accomplishing the goal. And what is that goal? Reaching competitive performance with native language implementations like the JVM. In particular, I believe that if the GC MVP cannot deliver a low-overhead object model competitive with the JVM, then it's a no-go. This proposal in particular would be less efficient than what a JavaScript engine would come up with after speculative optimization, which is itself less efficient than what a JVM would do. So instead of getting better than both of them, it would get worse than both of them. That's moving in the wrong direction.

We need to be clear-eyed about where all this new trouble is coming from. The third point above identifies separate compilation as a source of new requirements. These problems are not inherent in the current MVP. The problems with inheritance/extensibility across the module boundary are an outcome of a new requirement that lowered code can interoperate across a module boundary. What is "lowered" code? What I mean by that is that language operations such as invokevirtual, getfield, etc are translated (lowered) to Wasm (GC) offline, and then that lowered code is expected to interoperate with other lowered code through lowered imports/exports that operate with lowered constructs. As I pointed out in a separate email thread a couple months back, frankly, this will never work. It will always allow programs to violate language semantics by misusing the lowered constructs in uncheckable ways that pass Wasm's type system but break the language's semantic guarantees. It will also be inherently inefficient as you will constantly introduce indirections to accomplish late binding, which is exactly what is happening here. And it won't help to add one-off constructs to Wasm (e.g. a nominal inheritance mechanism) to patch over this lack. No matter what constructs are proposed to add to the MVP, expecting lowered code to interoperate across the module boundary will never be as good as late binding (late lowering). It's just fundamentally more flexible.

Instead, we should be pursuing late binding. I will soon propose a way to encode languages into Wasm that doesn't require lowering too early. Then we need to make late lowering work efficiently through a proper API. The good news is that we need almost no changes to any of the proposals in the pipeline, except to enhance the type imports proposal and/or merge it with the module-linking proposal. In fact, it's completely orthogonal to any decisions we make here; you don't even need any Wasm GC proposal[1] (!?) to accomplish this: only reference types and type imports.

The high order bit of that approach is that if you want separate compilation and interoperability across a module boundary, then you need late binding with a runtime linker[2] that coordinates your language model and presents lowered code to the engine. The runtime linker is actually not too scary, and I am building some to validate this idea. If you don't need separate compilation, you can do fully offline (static) translation and directly present lowered code to the engine. This is the scenario for which the Wasm GC proposal was designed, but it's become increasingly clear to me that even though late binding is totally orthogonal, we cannot ship Wasm GC without having a story for late binding that retains current engines' compilation/caching model.

[1] Of course, you do need to eventually hit bottom for lowering. The Wasm GC proposal, whatever form it takes, will eventually be the final language into which higher-level languages are lowered. [2] The linker here is something more complex than a simple two-level name space and something less complex than a runtime compiler.

RossTate commented 4 years ago

I gave four different problems that four different languages on four different runtimes have with the current MVP that cause them to be better addressed by the simple proposal here. While I understand (and was aware of) your point about dynamic linking, that addresses only one of these problems. And my impression is that people do not want this proposal to become dependent on yet another proposal, or at least not one that sounds like quite a substantial undertaking, especially since there's growing concern that people will just start targeting linear-memory garbage collection if we can't get host-managed garbage collection out in reasonable time. So if our proposal makes WebAssembly a better target for these languages now, and you're concerned about competing with JavaScript now, then that's good reason to prefer it over the current MVP. And if you want WebAssembly to eventually be a better target than the JVM—where there are many challenges we will need to overcome to make that happen (such as dynamic linking), many of which we have barely begun to investigate in any real depth—then the simplicity of the current MVP makes it much more likely than the current MVP to be able to adapt to the surprises that will inevitably arise during that endeavor. (A concerning example I happen to know of off the top of my head is that no structural-based low-level type system (that I am aware of) has been able to guarantee that a JVM String[] only contains String values, even if you're willing to put aside the JVM's array covariance. They all have required every (non-Object) array access to perform a downcast.)

taralx commented 4 years ago

I'm not convinced that the MVP has to be better than compiling to JS. For example, WASI doesn't have a JS backend to support it, so you can't use that there, but you could use this there. There are complexities and inefficiencies to using JS objects from wasm.

I also would be surprised if the techniques used by JS engines to infer static typing despite dynamic semantics couldn't also be applied to this MVP. After all, most code will probably put the same underlying typed object into the same place in a structref...

kripken commented 4 years ago

WASI doesn't have a JS backend to support it

Maybe I'm misunderstanding the intention here, but you can compile WASI programs to JS using wasm2js + one of the JS implementations of the WASI APIs. And there is Cheerp which compiles to actual JS objects.

aardappel commented 4 years ago

I agree with @titzer that in the most general case of dynamic linking (where modules where compiled completely independently) no amount of Wasm features will be able to guarantee no semantics of any language can be violated. The discussion would be more productive if we would ignore these concerns entirely. A language basically has the choice to a) use very conservative representations at the boundaries (which I presume @titzer's late binding refers to), or b) use some form of language aware "linker" that can guarantee language semantics even with separate compilation (turning things into a closed world model at the latest possible moment).

But I'd agree with @RossTate that any remaining case of "this common operation in common language X always requires an expensive dynamic cast, and we have no known mechanism to optimize these away locally" should be treated extremely seriously, and would be good to compare against alternative proposals. And one of these competing systems remains "linear memory", because if a language can elide dynamic casts and range checks because it knows more about its types than a Wasm GC can know, then that is a serious performance advantage.

fgmccabe commented 4 years ago

Re Interior pointers. The fatness of the point is more than advertised @fgmccabe:

There needs to be both reference and scalar aspects

I'm not clear on where this would be necessary for typed interior pointers? A fat pointer to a reference type would use an index into the reference array instead of an index into the scalar array.

An interior pointer to a sub-record needs to know where the references and the scalars of the sub-record are located.

Horcrux7 commented 4 years ago

I does not understand the concept of tagref. Is a sample possible? Ideally with a Java like source language.

taralx commented 4 years ago

@Horcrux7 If you refer to the "Java/C#/Kotlin/Scala Object Layout" section above, all of the structrefs and funcrefs are actually packed into tagrefs before being stored in the object. This keeps the dynamic casts explicit.

Horcrux7 commented 4 years ago

The section "Java/C#/Kotlin/Scala Object Layout" described like my compiler for Java work. Every object (also arrays) include a class description and a hashcode. But this not required special support from the WASM runtime.

This keeps the dynamic casts explicit.

If I understand you than you want make the language type hierarchy available to the WASM runtime in the tagref? This would reduce language features to the features available in WASM.

taralx commented 4 years ago

@Horcrux7

But this not required special support from the WASM runtime.

Without support from the runtime, garbage collection is much more expensive to do. But any GC needs to trace references.

If I understand you than you want make the language type hierarchy available to the WASM runtime in the tagref?

Not necessarily. It depends on the needs of the compiler. tagref is necessary in order to be able to put multiple types of reference into the structref, but anything beyond that is up to you.

RossTate commented 4 years ago

If I understand you than you want make the language type hierarchy available to the WASM runtime in the tagref?

No, tagref is primarily expected to represent canonically tagged structref and funcref. The other cases are for imported types (which we do not assume are reference types) and "private" data (whose custom tag is not exported so that other modules cannot inspect or modify it).

If the language chooses to reify its type hierarchy, it would generally be expected to do so by encoding it in the same manner it would do in its own runtime. For example, an object (encoded as a structref) could have a reference to a vtable (encoded as a structref) which could contain an array of references to superclass/superinterface runtime representations (each encoded as a structref).

tlively commented 2 years ago

The proposal has now moved to phase 2 and consensus has coalesced around a different design, so I'll close this issue.

taralx commented 2 years ago

Very fair. I am considering if I want to fork this off into a "dynamic memory" proposal, with the purpose of allowing JS typed arrays to cross the JS-wasm boundary without copying.

tlively commented 2 years ago

@dtig's been looking at how to reduce copies across the boundary and iterating on a pre-proposal over at https://github.com/dtig/memory-control/blob/master/proposals/memory-control/Overview.md. I would chat with her if you're interested in that space.

taralx commented 2 years ago

Will do, thanks!

WebAssembly / gc