WebAssembly / interface-types

Other
641 stars 57 forks source link

Update `string.(lift|lower)_memory` to read `u32` + `string.size` #102

Closed Hywan closed 4 years ago

Hywan commented 4 years ago

Initially, string.lift_memory reads a pair of i32 to represent the pointer/the base, and the length of the string. string.lower_memory also reads a i32 to represent the pointer/the base. Finally, string.size returns a i32.

This patch proposes to change i32 by u32. Indeed, a pointer/a base cannot be negative, so it can be represented by u32. The length of the string can also not be negative, so it is by definition a u32.

In addition, using i32 restricts ourself to a memory of 2Gib, whilst u32 allows to address a memory of 4Gib.

Thoughts?

(Originates from https://github.com/wasmerio/wasmer/pull/1329 and https://github.com/wasmerio/wasmer/pull/1337)

eqrion commented 4 years ago

Technically, i32 is a core wasm type and so the interpretation of the bits (i.e signedness) is determined by the instruction that consumes it. This is the case for the core wasm instructions, like i32.load which interpret the i32 as an unsigned integer. The proposed string.lift/lower_memory instructions follow that precedent and interpret the i32 as an unsigned offset into linear memory.

This does raise an interesting question though of when using an interface integer type is more appropriate than a core integer type. The fact that both are available in interface-types raises a good amount of confusion.

I would lean towards using the core i32 type here to match how core wasm instructions specify linear memory offsets. In the future with wasm64, there may be a new abstraction for whether the offset is an i32 or i64, and I can imagine we would want to be compatible with that.

Hywan commented 4 years ago

Thank you for the detailed answer.

My understanding is that it depends of the intent. Do we want to express the semantics of the instruction, or the “implementation”/low-level representation of the instruction (as Wasm core would do)?

My personal opinion is that the document should reflect the semantics of the instructions, and it's up to the implementor to find the best strategy. Because, since WIT allows to express more types, why the document should restrict itself to wasm-core types only?

lukewagner commented 4 years ago

Lifting inherently converts from core wasm types to interface types, so having string.lift/lower take/produce a u32 would require an additional otherwise-unnecessary instruction to lift/lower the core i32 which is ultimately flowed from/to core wasm.

Hywan commented 4 years ago

Thanks :-)