Experiments with Record design

WebAssembly / interface-types

Experiments with Record design #61

Open jgravelle-google opened 4 years ago

jgravelle-google commented 4 years ago

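I've been experimenting with a design for structs/records in my polyfill of interface types (viewable here: https://github.com/jgravelle-google/wasm-webidl-polyfill/tree/794070f109c38b7c7cb99e1ca2dbcf82031f1476/record). This is all open for discussion, but here's a thing I tried and some of the stuff I ran into.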

Declaring a record type looks like:

(@interface type $Foo record
  (field $bar string)
  (field $baz int)
)

This declares a record type Foo with fields bar and baz. Note that the fields are represented as indices in instructions to follow, but in the declaration the names are preserved in the binary. This is so languages like JS, Python, Lua, etc. can have reasonable string names in their code, e.g. (JS):

instance.exports.readFoo({bar: "hello", baz: 12});

There are two main instructions needed to interact with record objects in an interface adapter: creation and destructuring. My version has make-record and get-field.

get-field is straightforward: given a record and a field, it gets that field off the record. So get-field $Foo $baz pops a Foo and pushes an int. get-field takes two immediates: the type index and the field index.

make-record is straightforward too, but raises some interesting questions. make-record $Foo pops a string and an int, and pushes a Foo. In general make-record takes the type index as an immediate, and has one stack argument per field.

Where it gets interesting is the question: what arguments do we pass to make-record? Let's say we have a corresponding C struct:

struct Foo {
  char* bar;
  int bar_len;
  int baz;
};

and a function

void readFoo(struct Foo foo);

how does that foo argument translate to C's ABI? One reasonable way is to destructure the struct into its components, and pass those all as arguments individually. Another reasonable way is to stack-allocate the argument in the caller's frame, and pass the pointer in (this is what Clang does, and I think is a standard C ABI thing).

If we destructure, the adapter is just re-structuring those arguments back into a record. If we pass by pointer, however, we now need some way to read fields off that pointer. What I'm doing for the time being is defining an exported getter function for each field in the C struct. This works, but can almost certainly be improved. I'm not sure how to improve it without respecifying load+store instructions in interface adapters. We would also need to do something similar for GC objects. The nice thing about call-export is that it lets us defer reimplementing anything expressible with wasm instructions. The downsides are that it requires a specific kind of toolchain integration to generate those exports (not really too bad), and it relies on engine inlining to not be inefficient.
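As a rough illustration of the getter-per-field approach, here is a minimal C sketch. The getter names match the call-export names used in the adapter example later in this thread; the const-pointer signatures are an assumption, and the export attributes a real wasm toolchain would need are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the struct Foo declared in the post. */
struct Foo {
  char* bar;
  int bar_len;
  int baz;
};

/* One getter per field. Each takes the pointer that the by-pointer ABI hands
   to the callee and returns a single scalar the adapter can consume.
   Assumption: in a real build each function would also be marked as a wasm
   export; that toolchain-specific detail is omitted here. */
char* get_bar(const struct Foo* foo) {
  assert(foo != NULL);
  return foo->bar;
}

int get_bar_len(const struct Foo* foo) {
  assert(foo != NULL);
  return foo->bar_len;
}

int get_baz(const struct Foo* foo) {
  assert(foo != NULL);
  return foo->baz;
}
```

Each getter is a single load, which is why the approach leans on engine inlining to be cheap.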

So that's the general sketch of a design I've been working with. Thoughts?

fgmccabe commented 4 years ago

If I understand correctly, you are 'pushing' both the field names and values on the intermediate stack. I would suggest a different approach: think of the signature of the record as a recipe for decoding the stack. With the recipe 'in hand' you pop elements off the stack as dictated by the signature.

On Wed, Aug 28, 2019 at 1:26 PM Jacob Gravelle notifications@github.com wrote:

I've been experimenting with a design for structs/records in my polyfill of interface types (viewable here https://github.com/jgravelle-google/wasm-webidl-polyfill/tree/794070f109c38b7c7cb99e1ca2dbcf82031f1476/record). This is all open for discussion, but here's a thing I tried and some of the stuff I ran in to.

jgravelle-google commented 4 years ago

With the recipe 'in hand' you pop elements off the stack as dictated by the signature.

Yes, exactly. Maybe an example makes that clearer (other than the example linked in the polyfill):

(@interface func $printFoo (import "" "printFoo")
  (param $Foo)
)
(@interface adapt (import "" "printFoo")
  (param $ptr i32)

  ;; read bar string
  arg.get $ptr
  call-export "get_bar"
  arg.get $ptr
  call-export "get_bar_len"
  read-utf8

  ;; read baz int
  arg.get $ptr
  call-export "get_baz"
  as int

  ;; pops string and int, pushes Foo
  make-record $Foo ;; $Foo refers to a type index

  call $printFoo
)

So, all of the record data is implied by the initial declaration, and all following uses just refer to a type index. Ditto for get-field, though it also needs a field index (but has enough information to just need an index)
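To make the stack behavior concrete, here is a small C model of make-record and get-field. It is only a sketch: every type and function name in it (RecordType, Stack, make_record, and so on) is hypothetical, records are limited to int and string fields, and nothing here reflects how the polyfill is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_FIELDS = 4, MAX_STACK = 8 };

typedef enum { KIND_INT, KIND_STRING } FieldKind;

/* A non-record interface value: either an int or a string. */
typedef struct {
  FieldKind kind;
  int int_value;
  const char* string_value;
} Field;

/* The declared record type: the "recipe" that says how many values
   make-record pops and what kind each one must be. */
typedef struct {
  size_t field_count;
  FieldKind kinds[MAX_FIELDS];
} RecordType;

typedef struct {
  size_t type_index;
  Field fields[MAX_FIELDS];
} Record;

/* One stack slot: a scalar or a record. */
typedef struct {
  int is_record;
  Field scalar;
  Record record;
} Value;

typedef struct {
  Value slots[MAX_STACK];
  size_t depth;
} Stack;

static void push(Stack* s, Value v) {
  assert(s->depth < MAX_STACK);
  s->slots[s->depth++] = v;
}

static Value pop(Stack* s) {
  assert(s->depth > 0);
  return s->slots[--s->depth];
}

void push_int(Stack* s, int x) {
  Value v = {0};
  v.scalar.kind = KIND_INT;
  v.scalar.int_value = x;
  push(s, v);
}

void push_string(Stack* s, const char* str) {
  Value v = {0};
  v.scalar.kind = KIND_STRING;
  v.scalar.string_value = str;
  push(s, v);
}

/* make-record: only the type index is an immediate. Field values are popped
   in reverse declaration order because the last field is on top. */
void make_record(Stack* s, const RecordType* types, size_t type_index) {
  const RecordType* type = &types[type_index];
  Value out = {0};
  out.is_record = 1;
  out.record.type_index = type_index;
  for (size_t i = type->field_count; i-- > 0;) {
    Value v = pop(s);
    assert(!v.is_record && v.scalar.kind == type->kinds[i]);
    out.record.fields[i] = v.scalar;
  }
  push(s, out);
}

/* get-field: the type index and the field index are both immediates. */
void get_field(Stack* s, const RecordType* types, size_t type_index,
               size_t field_index) {
  Value v = pop(s);
  assert(v.is_record && v.record.type_index == type_index);
  assert(field_index < types[type_index].field_count);
  Value out = {0};
  out.scalar = v.record.fields[field_index];
  push(s, out);
}
```

Field names never appear at runtime in this model; the instructions carry indices only, and the names live in the type declaration.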

pchickey commented 4 years ago

I'd expect that for C and Rust the module type will pass the record as a pointer to struct just like you describe. Using export functions to convert from pointer to value isn't ideal as the only way to do it - the common case would be to load from an offset (provided as a literal) from the pointer in linear memory. Following the convention of memory-to-string, maybe that op is memory-to-{i32, i64, f32, f64} <memory name> <offset> <ptr>?
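A hedged sketch of what such an op would compute, written as a C function over a byte buffer standing in for linear memory. The function name memory_to_i32, its bounds check, and the offset constants are all assumptions for illustration; the offsets assume a wasm32 layout of struct Foo in which pointers and int are both 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field offsets of struct Foo, assuming a wasm32 layout where pointers and
   int are both 4 bytes. */
enum { FOO_BAR_OFFSET = 0, FOO_BAR_LEN_OFFSET = 4, FOO_BAZ_OFFSET = 8 };

/* Hypothetical stand-in for "memory-to-i32 <memory> <offset> <ptr>":
   read a 32-bit value at ptr + offset inside a linear memory buffer. */
int32_t memory_to_i32(const uint8_t* memory, size_t memory_size,
                      uint32_t offset, uint32_t ptr) {
  uint64_t address = (uint64_t)ptr + offset;
  /* Reject reads that would run past the end of memory. */
  assert(address + sizeof(int32_t) <= memory_size);
  int32_t value;
  /* memcpy sidesteps alignment and aliasing problems. It reads in host byte
     order, which matches wasm only on a little-endian host. */
  memcpy(&value, memory + address, sizeof value);
  return value;
}
```

With this shape, reading baz from a struct at address ptr becomes memory_to_i32(memory, size, FOO_BAZ_OFFSET, ptr), with no exported getter involved.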

alexcrichton commented 4 years ago

I think it may be inevitable that we add enough adapter instructions to handle indirectly passed arguments and/or return pointers for large aggregates, but I also think it's reasonable to have a world with perhaps a special ABI (maybe not the default, but one that had to be explicitly named) which changed the ABI of how aggregates were passed and made them all "splat" to be expanded inline as arguments and passed via multi-value as return values.

For example something like this:

#[repr(C)]
struct A {
    a: i32,
    b: f32,
}

extern "C" {
    fn roundtrip1(a: A) -> A;
}

extern "wasi" {
    fn roundtrip2(a: A) -> A;
}

would generate an import where roundtrip1 would take a return pointer for the return value and also take a pointer for the argument. The roundtrip2 API would take an i32 and f32 parameter and return an i32 and f32 result, the components of each struct.
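The two lowerings can be approximated in C as follows. This is a sketch only: the _lowered function names are made up, and because C has no multi-value return, the two results of the splat version are modeled with out-parameters, which a real multi-value ABI would not need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A {
  int32_t a;
  float b;
};

/* Pointer-based lowering (like roundtrip1): the argument and the result both
   travel through memory, and the caller supplies the return slot. */
void roundtrip1_lowered(struct A* ret, const struct A* arg) {
  assert(ret != NULL && arg != NULL);
  *ret = *arg;
}

/* Splat lowering (like roundtrip2): the fields arrive as separate scalars.
   Assumption: out_a and out_b stand in for the two multi-value results. */
void roundtrip2_lowered(int32_t a, float b, int32_t* out_a, float* out_b) {
  assert(out_a != NULL && out_b != NULL);
  *out_a = a;
  *out_b = b;
}
```

In the splat form the adapter sees only an i32 and an f32, so it never needs to know where the struct lives in memory.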

All that to say that I think we should probably strive to get by without memory-related instructions in the first pass of stabilization for interface types, and I think it's possible to do from C/Rust as well (assuming LLVM gets enough support for multi-value of course). In the long run though I could see memory instructions coming into existence perhaps, but they seem sort of solely motivated at this time by a lack of support in LLVM and a lack of specificity around the ABI implemented in LLVM currently.

pchickey commented 4 years ago

The "splat" ABI you describe is a great idea! I was talking to @fitzgen about custom Rust ABIs last week and he had some great ideas about how to implement them using procedural macros.

All that to say that I think we should probably strive to get by without memory-related instructions in the first pass of stabilization for interface types, and I think it's possible to do from C/Rust as well (assuming LLVM gets enough support for multi-value of course)

We may be able to use the trick above to avoid memory-related instructions in records, but we may need them for sequences. I don't want to make that an excuse to use them here if another solution will do, but I don't have a strong aversion to including them. Can you help me understand why we should avoid them?

tlively commented 4 years ago

Note that passing large structs as multivalue returns will not work for self-referential structs and may have negative performance consequences (although that is just speculation until engines more widely support multivalue).

I do expect that we will introduce a new C ABI for returning small structs directly on the stack once we have implemented multivalue, but I don't expect that we'll want to do that for structs of more than a few fields.

alexcrichton commented 4 years ago

@pchickey

Can you help me understand why we should avoid them?

Oh sure yeah, let me clarify. If the only motivation for the memory instructions is that LLVM can't do multi-value or the splat-ABI, that doesn't seem to me like it's worthwhile to add the memory instructions. If, however, there's other use cases motivating the memory instructions, that seems totally reasonable to me!

@tlively

Note that passing large structs as multivalue returns will not work for self-referential structs and may have negative performance consequences

Is it legal in C/C++ to pass a self-referential struct by-value? In Rust at least if you return a whole struct it's changing the address of the struct's storage so it can't be self-referential. That also seems like it's somewhat of a niche use case which may not be a killer motivator for the memory instructions?

As for the perf consequences, it'd be interesting to test out and evaluate! I'm also not sure myself what it would be.

jayphelps commented 4 years ago

Lurking and wanted to chime in that I've been experimenting with "record as multi-value return or params" in a bespoke language and discovered the current limits set by Chrome, currently 1000 params/return values. 1000 * 8 bytes (i64) = 8000 bytes, or about 7.8 KiB. Just wanted to provide an FYI to anyone reading this, though I imagine many of you were already aware of these limits.

Not saying whether the current limits are good or bad, other than that, all things being equal, higher limits are better so larger records could be passed without touching linear memory. I'm not sure how params and multi-value return are actually implemented (could guess), so I'm not sure of the pros and cons of higher limits.

Haven't checked other VMs. Although they could increase that limit, it seems binaries would run into backwards compatibility trouble and would need to ship two binaries, choosing between them with a runtime check at startup such as WebAssembly.validate(wasmThatHasMoreThan1000ParamsAndReturns). Then again, multi-value isn't supported without a flag anywhere yet AFAIK, so presumably if the limits increased at the same time, you'd combine the checks.

tlively commented 4 years ago

@alexcrichton, good point, the passed copy of the self-referential struct would definitely still point to the source copy. This does make me wonder what happens to pointers generally in this scheme, though. I suppose the adapter would have to know how to lift the pointers to some more generic reference type.
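The pointer behavior described here is easy to see in plain C. In this sketch, struct Node and still_self_referential are hypothetical names chosen for illustration; the by-value parameter plays the role of the copy a splat or multi-value ABI would hand over.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
  int value;
  int* self; /* intended to point at this struct's own value field */
};

/* Receives the struct by value. Returns 1 if the copy's pointer refers to the
   copy's own field, and 0 if it still refers to some other object. */
int still_self_referential(struct Node copy) {
  assert(copy.self != NULL);
  return copy.self == &copy.value;
}
```

Calling still_self_referential on a Node whose self points at its own value returns 0, because the pointer is copied bit for bit and keeps referring to the source object. An adapter that splats such a struct would have to decide what the pointer field means on the other side.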

fitzgen commented 4 years ago

@jgravelle-google, thanks for writing this up.

I'd like to reiterate my sentiment from our last video meeting:

that I am in favor of this idea in general and think we should pursue it further,
and that we should hold off on any sort of memory-to-record operator that is analogous to memory-to-string (or read-utf8 or whatever we end up calling it) for now. memory-to-record involves blessing a particular in-memory representation of a record, which I don't think has as simple an answer as strings. Let's punt on it for now and see if plucking fields one by one is actually that much slower than a memcpy in practice and if engines can't optimize it away.

@jgravelle-google:

how does that foo argument translate to C's ABI? One reasonable way is to destructure the struct into its components, and pass those all as arguments individually. Another reasonable way is to stack-allocate the argument in the caller's frame, and pass the pointer in (this is what Clang does, and I think is a standard C ABI thing).

I think this is mostly a concern of the C toolchain that is producing the Wasm and its interface types section, and not a concern for our standardization efforts.

It is the toolchain's responsibility to make available the values that will be used in the adapter functions. This doesn't necessarily mean calling exports (and the out-of-line call overhead that implies), nor does it necessarily mean that we must introduce memory-related adapter instructions. The toolchain can use a custom ABI (the "destructuring" approach) for functions that get wrapped in an adapter, instead of LLVM's default C+Wasm ABI.

I am very cautious about making standardization decisions based on how LLVM currently maps the C ABI onto Wasm. Toolchains should implement standards, not the other way around. Of course, if we standardize something that is unimplementable (or unimplementable in a performant way) then we must address that and fix the standard, but we shouldn't start with reversing the way things happen to be right now.

All this is to say that I 100% agree with what @alexcrichton says here:

If the only motivation for the memory instructions is that LLVM can't do multi-value or the splat-ABI, that doesn't seem to me like it's worthwhile to add the memory instructions. If, however, there's other use cases motivating the memory instructions, that seems totally reasonable to me!

Backing up a bit, we clearly want to support some subset of Wasm instructions in adapter functions. Maybe it is only calling exports, maybe it also includes loads and stores. I hope it doesn't include loop ... end :-p

Instead of defining a bunch of instructions that are "like instr.foo in Wasm, but in our adapter language, with this potentially different binary encoding", what if we allowed wasm instr* end blocks in adapter functions? We would enumerate the subset of Wasm instructions that are valid within these blocks, allow them to take and return Wasm values from the top of the adapter's heterogeneous stack, define how to create their validation context and runtime structure, and then we get to reuse all the existing semantics and binary structure from Wasm itself.

jgravelle-google commented 4 years ago

and that we should hold off on any sort of memory-to-record operator that is analogous to memory-to-string (or read-utf8 or whatever we end up calling it) for now

Strongly agree. My best argument against is that the complexity of specifying struct packing or alignment is probably not worth the benefit.

I think this is mostly a concern of the C toolchain that is producing the Wasm and its interface types section, and not a concern for our standardization efforts. It is the toolchain's responsibility to make available the values that will be used in the adapter functions.

It is also the standard's responsibility to map well to common toolchain designs. I use what-Clang-does-today as well as Rust to demonstrate points in the design space, which is where we might want to be flexible. It's easy for me to imagine a non-LLVM toolchain that uses either ABI.

In particular I want to minimize the effort needed to implement this across the ecosystem. The more flexible our primitives, the easier it will be to get an existing compiler to use them without invasive changes.

I am very cautious about making standardization decisions based on how LLVM currently maps the C ABI onto Wasm. Toolchains should implement standards, not the other way around.

I want to dig into this more because it covers a broad point. I keep referencing specific things like what LLVM does, how C works, and how I threw together an experiment in my polyfill, not because I think those are the best way, but because they are very specific examples I can point to directly. One of the most challenging things about having productive discussions about Interface Types is that it is at a very high level of abstraction, and that makes it hard for different people to agree on mental models. I've been trying to approach things with a focus on specifics, though that seems to come across as being focused on LLVM's current limitations. I can try caveating that more up-front I think, or paint a better picture of how things might generalize from there (or why I think an LLVM-specific thing is more general than that).

Toolchains should implement standards, not the other way around.

I don't actually disagree with this, but it feels like an axiom and that makes me skeptical. An alternate view would be that standards should capture the essence of the best solution for the problem, which may be already present in existing tools, and so we should understand why they do what they do. ... But that's probably unproductively distracting :D

If, however, there's other use cases motivating the memory instructions, that seems totally reasonable to me!

Strong agree. My intuition says there will be, but I think it will depend on how we want to handle sequences, or other data structures. In general I think it makes sense to have memory instructions iff it is the case that some modules represent data we need to reason about in memory. And in general I think they do.

what if we allowed wasm instr* end blocks in adapter functions? We would enumerate the subset of Wasm instructions that are valid within these blocks, allow them to take and return Wasm values from the top of the adapter's heterogeneous stack, define how to create their validation context and runtime structure, and then we get to reuse all the existing semantics and binary structure from Wasm itself.

I dig that. That feels like the right mechanism to just import a bunch of wasm semantics wholesale. Can do things like wasm i32.const 0 end instead of needing to re-spec all the consts. It's probably not any easier to implement, but spec-wise and concept-wise it's simpler.

That feels like it should be a separate issue to get more visibility and focused discussion.