WebAssembly / interface-types

Add integers to explainer #64

Closed lukewagner closed 5 years ago

lukewagner commented 5 years ago

A good next step would be to add a(n) interface type(s) to represent integers.

The basic requirements I'm aware of are:

  1. The values of the integer interface type(s) should be explicitly-signed integers, unlike core wasm's sign-less i32 and i64.
  2. The integral lifting instructions should allow core wasm i32/i64 values to be converted into integers in the full signed and unsigned 32-/64-bit ranges, resp.
  3. When an integral interface value is converted to JS, it should be possible to statically determine (based on the type, not the value) whether to produce a Number or a BigInt (these types being non-interoperable).
  4. The integer interface type(s) should be usable as the element/field types of sequences/records/variants. In particular, sequences of integers need an explicit sign+width, so that they can produce typed arrays (https://heycam.github.io/webidl/#dfn-typed-array-type) with the desired element type when passed to Web IDL methods or JS.

Additionally, I don't know if they're hard requirements, but these would be nice to have:

  5. There should not be many integral adapter instructions, and they should only do simple things like truncation or sign extension.
  6. An API provider can reflect simple range requirements stemming from the i32/i64 representation type in the static type, rather than in documentation.
  7. An API provider should be able to increase the acceptable range of integer values over time (in a covariant position) without it being a backwards-incompatible breaking API change.
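
To make the full-range lifting requirement concrete, here is a minimal sketch of what lifting a sign-less core wasm bit pattern into an explicitly-signed integer could look like. The function names `lift_unsigned` and `lift_signed` are hypothetical and are not taken from any proposal text; Python's unbounded `int` stands in for the mathematical integer.

```python
def lift_unsigned(bits: int, width: int) -> int:
    """Interpret the low `width` bits of a core wasm value as an unsigned integer."""
    return bits & ((1 << width) - 1)


def lift_signed(bits: int, width: int) -> int:
    """Interpret the low `width` bits as a two's-complement signed integer."""
    value = lift_unsigned(bits, width)
    sign_bit = 1 << (width - 1)
    return value - (1 << width) if value & sign_bit else value


# The same sign-less i32 bit pattern lifts to two different integers.
all_ones_i32 = 0xFFFF_FFFF
print(lift_unsigned(all_ones_i32, 32))  # 4294967295
print(lift_signed(all_ones_i32, 32))    # -1
```

The point of the sketch is only that the sign has to come from somewhere outside the core value: either the interface type or the lifting instruction must supply it.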

My inclination would be to start with fixed-width integral interface types u8/s8/u16/s16/u32/s32/u64/s64, with possible future extension to an unbounded integer type (as motivated by need). In addition to obviously addressing (1), (2) and (6), I think the other requirements/nice-to-haves can be addressed/mitigated with:

  • For (3), all <=32-bit fixed-width types produce numbers, the 64-bit fixed-width types and the future integer type produce BigInt (symmetric to BigInt integration https://github.com/WebAssembly/JS-BigInt-integration)
  • For (4), the fixed-width types line up 1:1 with typed array types, with the exception of Uint8ClampedArray, but I think this type is only useful for JS working directly with arrays, not calling APIs.
  • For (5), I think we can use a small set of parameterized instructions (https://webassembly.github.io/spec/core/syntax/instructions.html#parametric-instructions) that only allow truncation during lifting and sign-extension (implied by the interface type) during lowering.
  • For (7), I think we can define a coercive subtyping relation between integer types (just based on set inclusion) so that the caller can import and call a function with a bigger width, meaning that increasing integer width is not a breaking change.

An alternative would be to start with only integer, but I worry that that misses requirements (3) and (4) and nice-to-have (6).

Thoughts?

fgmccabe commented 5 years ago

If you proliferate the interface types with all those integer types you will have a much larger set of lifting and lowering operators. Essentially, IMO, you do not need it. In fact, the interface types should be thought of as something analogous to a serialization format: you need a way of representing any integer without fixing on a particular one.

lukewagner commented 5 years ago

@fgmccabe As I stated above, I don't think it needs to be a much larger set of instructions. Also, what about all the other requirements listed in the post?
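
One way to picture a small instruction set is a single lift and a single lower, each parameterized by the interface type, instead of one operator per type. The sketch below is an illustration under that assumption only: `IntType`, `lift` and `lower` are hypothetical names, and the out-of-range check in `lower` is a choice made for the example, not something the thread specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    """Hypothetical descriptor for a fixed-width interface integer such as s8 or u32."""
    signed: bool
    width: int  # 8, 16, 32 or 64


def lift(core_bits: int, ty: IntType) -> int:
    """Lift: truncate the core value to ty.width bits, then read it with ty's signedness."""
    value = core_bits & ((1 << ty.width) - 1)
    if ty.signed and value >= 1 << (ty.width - 1):
        value -= 1 << ty.width
    return value


def lower(value: int, ty: IntType, core_width: int) -> int:
    """Lower: produce a core_width-bit pattern; sign vs. zero extension follows from ty."""
    if core_width < ty.width:
        raise ValueError("core type is narrower than the interface type")
    if ty.signed:
        lo, hi = -(1 << (ty.width - 1)), (1 << (ty.width - 1)) - 1
    else:
        lo, hi = 0, (1 << ty.width) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} is out of range for the interface type")
    # Masking a negative Python int yields its two's-complement bit pattern,
    # which is exactly sign extension to core_width bits.
    return value & ((1 << core_width) - 1)


S8, U8 = IntType(True, 8), IntType(False, 8)
print(lift(0x1FF, U8))             # 255 (truncated to 8 bits)
print(lift(0xFF, S8))              # -1
print(hex(lower(-1, S8, 32)))      # 0xffffffff (sign-extended)
print(hex(lower(255, U8, 32)))     # 0xff (zero-extended)
```

With this shape, adding a new width adds a parameter value rather than a new pair of operators.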

jgravelle-google commented 5 years ago

I don't agree with all your requirements.

  1. The values of the integer interface type(s) should be explicitly-signed integers, unlike core wasm's sign-less i32 and i64.

Why? I think you're saying that signedness is a property of an interface, and we should be able to reflect that in the signature and therefore we need u32 and s32, but "we need signs" isn't a core requirement.

  2. The integral lifting instructions should allow core wasm i32/i64 values to be converted into integers in the full signed and unsigned 32-/64-bit ranges, resp.

This one's a good hard requirement. "We must represent all the values," agree.

  3. When an integral interface value is converted to JS, it should be possible to statically determine (based on the type, not the value) whether to produce a Number or a BigInt (these types being non-interoperable).

Agree, this should be statically determinable. I believe @fgmccabe and I are thinking that it need not be determinable by type alone, because we also have the information contained in the adapter instructions. (In particular, we can satisfy this with integer by looking at whether we convert via i32-to-integer+integer-to-i32 or i64-to-integer+integer-to-i64 pairs)
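
The two positions can be contrasted in a small sketch: one decides Number vs. BigInt from the interface type alone, the other from the core type named by the adapter instruction pair. Both function names are hypothetical, and the adapter variant assumes a single unbounded `integer` type whose JS representation is picked by whether the conversion goes through i32 or i64.

```python
def js_type_from_interface_type(type_name: str) -> str:
    """Decide by type alone: widths up to 32 bits map to Number, 64 bits to BigInt."""
    sign, width = type_name[0], int(type_name[1:])
    if sign not in ("u", "s") or width not in (8, 16, 32, 64):
        raise ValueError(f"not a fixed-width integer type: {type_name!r}")
    return "Number" if width <= 32 else "BigInt"


def js_type_from_adapter(core_type: str) -> str:
    """Decide by adapter instruction: i32 conversions give Number, i64 give BigInt."""
    if core_type not in ("i32", "i64"):
        raise ValueError(f"not a core wasm integer type: {core_type!r}")
    return "Number" if core_type == "i32" else "BigInt"


print(js_type_from_interface_type("u32"))  # Number
print(js_type_from_interface_type("s64"))  # BigInt
print(js_type_from_adapter("i64"))         # BigInt
```

Either rule is static; they differ in where the information lives, in the signature or in the adapter body.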

  4. The integer interface type(s) should be usable as the element/field types of sequences/records/variants. In particular, sequences of integers need an explicit sign+width, so that they can produce typed arrays with the desired element type when passed to Web IDL methods or JS.

Usable is a requirement, yes. I don't think there's anything fundamental about sign+width for sequence/record/variant in general though. I think it's fair to say that we should have some way to generate typed JS arrays, but I don't agree that the only way to do that is to bake the sign+width into the integer type. Another design (that isn't great, mind) is to have explicit special sequence types that map directly to JS typed arrays. So we'd have seq<int> + u8seq + s8seq, rather than seq<u8> + seq<s8>. My general point here is that having a convenient mapping from sequences of integers to JS typed arrays is a nice-to-have, not a hard requirement.
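
For reference, the 1:1 correspondence that seq<u8>, seq<s8>, etc. would rely on can be written down as a lookup table. The JS constructor names are the standard typed array names; the `TYPED_ARRAY_FOR` table and `typed_array_for` helper are hypothetical and only illustrate the mapping, not any specified binding.

```python
# Element type of the sequence -> JS typed array constructor name.
TYPED_ARRAY_FOR = {
    "u8": "Uint8Array",
    "s8": "Int8Array",
    "u16": "Uint16Array",
    "s16": "Int16Array",
    "u32": "Uint32Array",
    "s32": "Int32Array",
    "u64": "BigUint64Array",
    "s64": "BigInt64Array",
}


def typed_array_for(element_type: str) -> str:
    """Return the typed array name for a sequence element type, or raise if none exists."""
    try:
        return TYPED_ARRAY_FOR[element_type]
    except KeyError:
        raise ValueError(f"no typed array for element type {element_type!r}") from None


print(typed_array_for("s16"))  # Int16Array
print(typed_array_for("u64"))  # BigUint64Array
```

An unparameterized seq<int> has no key to look up in such a table, which is why the alternative design needs dedicated sequence types like u8seq.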

  6. An API provider can reflect simple range requirements stemming from the i32/i64 representation type in the static type, not documentation.

Why externalize this implementation detail? Why only externalize the module's internal (supposedly hidden) data format, but not other constraints on the domain of inputs? If only integers < 100 are allowed, why do we not express that statically and fall back on documentation? If expressing the domain is a useful property for semantic correctness, we should have a more general mechanism. If it's a useful property because it expresses reality and thus can be performant, then say that. I'm asking "why" because there is an underlying reason that has your intuition say this, and I believe that that is the real requirement.

On (5) and (7), I just agree.

So I think your analysis is generally right, but we disagree on how necessary certain aspects of the design are. More usefully I think we can order these based on importance, with a "hard requirements line". My stab at it:

Required:

  1. Can represent all wasm integer values
  2. Can use integers everywhere we use other types (sequence/record/variant)

Nice to have:

  1. Can statically generate correct adapters (Number or BigInt)
  2. API providers can make changes that are backward-compatible
  3. Have an efficient conversion to typed JS arraybuffers
  4. Can express domain on which a function is defined (values + signedness)
  5. Minimize the number of adapter instructions + semantics

Anyway I think we should make sure we're on the same page w/r/t requirements before discussing solutions in too much detail.

fitzgen commented 5 years ago

For (4), the fixed-width types line up 1:1 with typed array types, with the exception of Uint8ClampedArray, but I think this type is only useful for JS working directly with arrays, not calling APIs.

Quick note: the ImageData constructor requires a clamped array, which is a pretty important Web API that I think we want to call from Wasm.

lukewagner commented 5 years ago

@fitzgen Ah, thanks for pointing that out, I'll update that bit. In almost all the cases I could find, APIs that accepted Uint8ClampedArray also accepted Uint8Array, but I see you're right that the ImageData ctor (strangely) only accepts Uint8ClampedArray. I think an ok solution here is that, since we have the power to define the "Web IDL - WebAssembly Binding" that determines how wasm values get turned into Web IDL values, we can just specify that a wasm u8 list value may be converted to a Uint8ClampedArray Web IDL value.

jgravelle-google commented 5 years ago

Talked with @lukewagner offline and I changed my mind: for practical reasons, I now think we should specify sign+width. A small, subtle distinction is that it makes the most sense to me to describe width as an arbitrary number spec-wise (e.g., to be able to say things like s7 or u10), while only allowing (possibly temporarily) a reasonable subset of values: (u|s)(8|16|32|64)

This essentially covers all the criteria to me. Conceptually there's one int type, with bitwidth and signedness as parameters. If we want to generalize to specifying arbitrary ranges, we can with some spec twiddling (observing that s16 is just the values [-32768, 32767], and u8 is [0,255], so we can convert between the ways we represent them, and/or have predefined aliases)
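
The "one int type with bitwidth and signedness as parameters" view, together with the set-inclusion subtyping suggested earlier in the thread, can be sketched as follows. `int_range` and `is_subtype` are hypothetical helper names; arbitrary widths such as s7 or u10 are accepted here purely to illustrate the spec-level generality described above.

```python
import re
from typing import Tuple


def int_range(type_name: str) -> Tuple[int, int]:
    """Return the inclusive (min, max) value range of a type name like 's16' or 'u10'."""
    match = re.fullmatch(r"([us])(\d+)", type_name)
    if match is None or int(match.group(2)) < 1:
        raise ValueError(f"not an integer type name: {type_name!r}")
    signed, width = match.group(1) == "s", int(match.group(2))
    if signed:
        return -(1 << (width - 1)), (1 << (width - 1)) - 1
    return 0, (1 << width) - 1


def is_subtype(sub: str, sup: str) -> bool:
    """Set inclusion: every value of `sub` is also a value of `sup`."""
    sub_lo, sub_hi = int_range(sub)
    sup_lo, sup_hi = int_range(sup)
    return sup_lo <= sub_lo and sub_hi <= sup_hi


print(int_range("s16"))          # (-32768, 32767)
print(int_range("u8"))           # (0, 255)
print(is_subtype("u8", "s16"))   # True
print(is_subtype("u32", "s32"))  # False: u32 values above 2147483647 do not fit
```

Under this reading, widening a parameter from u8 to s16 keeps every existing caller valid, which is the backward-compatibility property in nice-to-have (7).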

lukewagner commented 5 years ago

Alright, resolved by #66, at least for the initial explainer stab at it.