WebAssembly / component-model

Repository for design and specification of the Component Model
Other
899 stars 75 forks source link

Rename `float32`/`float64` back to `f32`/`f64` #277

Open esoterra opened 7 months ago

esoterra commented 7 months ago

As of preview2, the NaN-canonicalization rules have been relaxed and so in preview 2 the set of semantic values of float32 is the same the set for core wasm f32 (and likewise for float64/f64).

Since this was the main motivator for distinguishing between core wasm fXX and component model floatXX types, we should consider gradually renaming the component model floating point types to match core wasm again.

badeend commented 7 months ago

If I understand https://github.com/WebAssembly/component-model/pull/260 correctly, they are still not exactly the same. I.e. the Canonical ABI may still deliberately randomize NaN bit patterns when crossing component boundaries.

lukewagner commented 7 months ago

It's true that the process of lifting and lowering is allowed to mess with the NaN payload bits of floats, but that's independent of the set of semantically-possible float values that can cross component boundaries. (Before #260, the set-of-possible-values was smaller for float32/float64 than f32/f64 because there was only a single NaN value.) Thus, this change makes sense to me (it even feels like an omission from #260). The change would also remove the asymmetry between the <letter><bitwidth> naming scheme of the integral types and the <word><bitwidth> of the float types, which is nice.

If we did this, probably we would want to deploy the change incrementally (accepting both in WAT/WIT for a while so that everyone can gradually switch, and then deprecating the old names).

sunfishcode commented 7 months ago

I find it useful to think of the rules not as "the bits might be randomized" but instead as "component-model floating-point types only have a single NaN value", with the randomization just being an artifact of how they're encoded in core wasm.

lukewagner commented 7 months ago

I think that perspective is almost equivalent, but where I think it diverges is when lifting float values that are passed out to non-wasm (and thus not randomized during lowering), you are allowed to see multiple distinct NaN payloads -- if there was only a single semantic NaN value, this wouldn't be allowed.

sunfishcode commented 7 months ago

Independently of what the types are named, is it desirable to allow multiple distinct NaN payloads to be observed when passing values out to non-wasm? It would seem that any code depending on that functionality would be unvirtualizable.

lukewagner commented 7 months ago

The original motivation for the change was to avoid forcing an extra canonicalization pass when, e.g., passing a list<f32> out to a host library. The allowed randomization means you can't (or at least shouldn't) depend on seeing any particular NaN payload, although you also can't depend on seeing exactly one or a particular one either.

sunfishcode commented 6 months ago

With #279, the component-level float32 and float64 values are different from core-wasm f32 and f64, in that they only have a single NaN value. Implementations can omit canonicalization and randomization instructions when doing lifting and lowering together, but this is a permitted optimization rather than a property of the types.

I've also now posted https://github.com/WebAssembly/component-model/pull/288 to add more documentation.

lukewagner commented 6 months ago

Yep, with those semantic changes, I think you're right. Once we merge #288, I think we can close this issue, but I'll wait a while to collect more thoughts if anyone disagrees.

rossberg commented 6 months ago

Hm, I understand the motivation, but as is I find the naming discrepancy more confusing than helpful, because there is nothing in the name which actually conveys how floatN is different, and moreover, why the naming scheme deviates from that for ints. As is, it just looks like a seemingly random choice and all it does is create a mental impedance mismatch.

Feel free to ignore, but short of a more self-explanatory alternative, I think I'd prefer the short names, since the semantics still is close enough and the difference, where it matters at all, implied by context.

lukewagner commented 6 months ago

I definitely like the aesthetics of f32/f64 and their symmetry with the integral names, but I had thought that different sets of semantic values forced different names. But now that I say that, that reminds me that we do of course have component-level func and instance types that are wholly distinct from the core-level func and instance types and we distinguish them via core: prefix when necessary, so I suppose we could do that for f32/f64 as well.

sunfishcode commented 6 months ago

Using f32/f64, and just documenting the differences, sounds good to me too.

esoterra commented 6 months ago

I fully agree with @rossberg's explanation of the real main issue and @lukewagner's framing about the core: prefix and component/module value types already being separate.

If we're all on the same page now, I think the main question is when/how to do this. I imagine we'll want to roll it out gradually so that for a while both are acceptable but only f32/f64 are emitted to give people time to make the switch. Should we also wait in general until after the preview 2 release or would it be better to tackle this sooner than later?

sunfishcode commented 6 months ago

Regardless of when we do the full switch, we should start updating tools to at least recognize f32/f64 sooner rather than later. I submitted https://github.com/bytecodealliance/wasm-tools/pull/1356 to add f32 and f64 recognition in wit-parser.

lukewagner commented 6 months ago

Great, thanks for doing that. Even though this isn't a breaking binary format / runtime change, to avoid confusing/breaking folks, I suppose we should wait for that change to merge, percolate a few months, and then we can switch over the WASI proposals and Explainer.md/WIT.md in this repo.