whatwg / webidl

Web IDL Standard
https://webidl.spec.whatwg.org/

Conversion from specification values to IDL values to ECMAScript values #674

Open annevk opened 5 years ago

annevk commented 5 years ago

Modern specification-prose will use https://infra.spec.whatwg.org/ data structures. E.g., an algorithm would return a list of JavaScript strings or a byte sequence.

However, IDL does not work that way. It only knows about ECMAScript and IDL values, which are 1:1, whereas specification constructs can fit in various IDL values. E.g., a byte sequence could become a JavaScript string via ByteString or a Uint8Array.
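To make the byte sequence example concrete, here is a minimal sketch of the two possible targets. Modeling a byte sequence as an array of numbers, and both helper names, are assumptions made for illustration; none of this is defined by Web IDL or Infra.

```typescript
// Assumption: an Infra byte sequence is modeled as an array of integers in 0..255.
type ByteSequence = readonly number[];

// Hypothetical helper: rejects anything that is not an integer in 0..255.
function assertByte(b: number): void {
  if (Math.floor(b) !== b || b < 0 || b > 0xff) {
    throw new TypeError("not a byte: " + b);
  }
}

// Target 1: a ByteString-style JavaScript string, one code unit (U+0000..U+00FF) per byte.
function byteSequenceToByteString(bytes: ByteSequence): string {
  let result = "";
  bytes.forEach(function (b) {
    assertByte(b);
    result += String.fromCharCode(b);
  });
  return result;
}

// Target 2: a Uint8Array holding a copy of the same bytes.
function byteSequenceToUint8Array(bytes: ByteSequence): Uint8Array {
  bytes.forEach(assertByte);
  return new Uint8Array(bytes);
}
```

The same input yields two unrelated JavaScript values, which is why the prose needs to say which IDL type it is targeting.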

Specification-prose is generally also not specific about numeric types and will assume appropriate casting happens at the boundary.

I'm not really sure how to solve this, but this came up in #664 and it seems out-of-scope to address there.

yackermann commented 5 years ago

I have the same concerns. Currently, in ECMAScript, you can convert anything to anything. For example when we try to enforce strict type checking. We would like to have some clarification on conversion from WebIDL to Ecmascript.

travisleithead commented 5 years ago

I'm not sure I understand the specific problem. Is this about what conventions to use when writing specs?

I think this could be an interesting case of where the implementation via C++ vs. implementation via JavaScript comes up. In a C++ implementation of a feature, it helps to have the spec be clear about the type of data you are using, and it does seem like WebIDL is written for this implementation audience--in other words, that the IDL types are the data structures that would be used and manipulated in spec prose and algorithms. Does this happen in practice?

For an implementation authored in JavaScript itself, the conversions to IDL types are meaningless because there's no language boundary to cross, in terms of where the public API meets the internal implementation. The JavaScript language itself defines all the rules for type converting among its different variable representations (for example, when concatenating two different types). Per my understanding, the Infra spec strongly implies that the core structures for spec development are based on JavaScript itself with some slightly more strongly typed conversions, such as the scalar value string, which are available for specific scenarios. If implementing in JavaScript and trying to conform to WebIDL, the type-validation code for the various IDL type conversions would need to be coded explicitly, pretty much following the conversion algorithms defined in WebIDL for ECMAScript -> IDL.
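As an illustration of what coding such a conversion explicitly might look like, here is a rough sketch for the IDL unsigned long type. It is based on one reading of the Web IDL integer conversion steps, it leaves out [Clamp], and the names toUnsignedLong and integerPart are hypothetical.

```typescript
// Integer part of a finite number, rounding toward zero.
function integerPart(x: number): number {
  return x < 0 ? Math.ceil(x) : Math.floor(x);
}

// Sketch: convert an arbitrary JavaScript value to an IDL `unsigned long`.
// `enforceRange` approximates the [EnforceRange] extended attribute.
function toUnsignedLong(value: unknown, enforceRange: boolean = false): number {
  // Unary plus applies the language's ToNumber (throws TypeError for Symbol and BigInt).
  const x = +(value as any);
  const isNonFinite = x !== x || !isFinite(x);
  if (enforceRange) {
    if (isNonFinite) throw new TypeError("value is not a finite number");
    const n = integerPart(x);
    if (n < 0 || n > 4294967295) {
      throw new TypeError("value is out of range for unsigned long");
    }
    return n + 0; // adding 0 turns -0 into +0
  }
  if (isNonFinite) return 0;
  // `>>> 0` truncates toward zero and wraps modulo 2^32.
  return x >>> 0;
}
```

Without enforceRange, a call such as toUnsignedLong(-1) wraps around instead of throwing, which is the lenient behavior the platform conventionally exposes.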

travisleithead commented 5 years ago

Responding to a couple of points from @herrjemand's comment...

> For example when we try to enforce strict type checking.

I assume you mean "more strict" than WebIDL currently handles--e.g., throwing an exception rather than performing a type conversion? In order to accomplish that, you'd need to basically use the any IDL type (essentially skipping all but the most straightforward pass-through type matching) and then handle all the logic to accept/reject that argument/setter in spec prose. In general, this is not recommended because it makes that API deviate from the expected conventions of the platform, unnecessarily increases the testing burden for the feature, and involves writing a lot more spec prose!
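For comparison, a stricter check of the kind described here, where the argument is declared as any and validated by hand, could look roughly like the sketch below. The function name and the exact acceptance rules are assumptions for illustration, not something taken from a spec.

```typescript
// Sketch: accept only a JavaScript Number that is already an integer in the
// unsigned long range, and throw instead of converting anything else.
function requireUnsignedLongStrict(value: unknown): number {
  if (typeof value !== "number") {
    throw new TypeError("expected a Number, got " + typeof value);
  }
  if (!isFinite(value) || Math.floor(value) !== value) {
    throw new TypeError("expected a finite integer");
  }
  if (value < 0 || value > 4294967295) {
    throw new TypeError("value is out of range for unsigned long");
  }
  return value + 0; // adding 0 turns -0 into +0
}
```

Here "42" and 3.5 are rejected outright, whereas the standard unsigned long conversion would accept both.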

> We would like to have some clarification on conversion from WebIDL to Ecmascript.

I'm not sure this issue is about a lack of clarity in the ECMAScript-to-WebIDL (and vice versa) conversions, which are quite precisely defined in the WebIDL spec (and rather exhaustively tested in the various reflection tests for HTML; see the approximately 40,000 tests at https://wpt.fyi/results/html/dom). As @annevk points out above, ECMAScript and IDL values are 1:1.

For all the precise details, i.e., step-by-step conversions from all ECMAScript types to IDL types and vice versa, see: https://heycam.github.io/webidl/#es-type-mapping

bzbarsky commented 5 years ago

> ECMAScript and IDL values are 1:1

This is not quite true. You could have two different IDL values that end up producing the same ES value (most simply two different numeric types, though you can also get this with a dictionary and a record, say). And conversely, you could obviously have quite different ES values that produce the same IDL value given an IDL type to convert to.

What is true (or should be) is that once you know what IDL type you are dealing with then (1) the set of IDL values for that type is well-defined, (2) how to convert those values to ES values is well-defined, and (3) how to convert ES values to IDL values for that type is well-defined.
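A small sketch of both directions of this non-1:1 mapping follows. Representing an IDL integer as a tagged object, and the helper names, are assumptions made purely for illustration.

```typescript
// Assumption: an IDL integer value is modeled as a number tagged with its IDL type.
type IdlInteger = { idlType: "octet" | "long"; value: number };

// IDL -> ES: both integer types map to a plain Number, so the type tag is lost.
function idlIntegerToEs(v: IdlInteger): number {
  return v.value;
}

// ES -> IDL `long`: once the target type is known, the result is well-defined.
// `| 0` truncates toward zero and wraps into the signed 32-bit range; NaN becomes 0.
function esToIdlLong(v: unknown): IdlInteger {
  const x = +(v as any);
  return { idlType: "long", value: x | 0 };
}

// Two different IDL values produce the same ES value:
const sameEsValue =
  idlIntegerToEs({ idlType: "octet", value: 1 }) ===
  idlIntegerToEs({ idlType: "long", value: 1 });

// Quite different ES values produce the same IDL value for a given type:
const sameIdlValue =
  esToIdlLong("5").value === esToIdlLong(5.9).value &&
  esToIdlLong(5.9).value === esToIdlLong(4294967301).value;
```

Both sameEsValue and sameIdlValue evaluate to true, yet each individual conversion is unambiguous once the IDL type is fixed.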

As far as spec prose goes, does that help at all? That is, can we define conversions as needed from infra values to IDL values as long as the desired type is known? We already have that for the sequence<T> IDL type. https://heycam.github.io/webidl/#idl-sequence last sentence says:

> Any list can be implicitly treated as a sequence, as long as it contains only items that are of type T.
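The quoted rule can be sketched as a checked reinterpretation: the list is reused as is, provided every item passes a check for T. The names listAsSequence, ItemCheck, and isDomString are hypothetical.

```typescript
// Hypothetical per-item check for some IDL type T.
type ItemCheck<T> = (item: unknown) => item is T;

// Sketch: treat an Infra list as sequence<T> only if every item is of type T.
function listAsSequence<T>(list: readonly unknown[], isT: ItemCheck<T>): readonly T[] {
  list.forEach(function (item, index) {
    if (!isT(item)) {
      throw new TypeError("item " + index + " is not of the expected type");
    }
  });
  // No copy and no per-item conversion: the same list is viewed as a sequence.
  return list as readonly T[];
}

// Example check, with a JavaScript string standing in for DOMString.
const isDomString: ItemCheck<string> = (item: unknown): item is string =>
  typeof item === "string";
```

For example, listAsSequence(["a", "b"], isDomString) hands back the same two-item list, while a list mixing strings and numbers is rejected.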

That won't help if your IDL type is a union and your infra value could conceivably map to several different types in the union, though. Back to @annevk's example, if you have a byte sequence and the IDL type is (ByteString or Uint8Array) then nothing clearly defines how that should get handled...

annevk commented 5 years ago

I think we can define conversions as long as the desired type is known.

If we can outlaw such unions I think that'd be the right thing to do.

bzbarsky commented 5 years ago

The problem is that as an argument type such a union might make a lot of sense.

annevk commented 5 years ago

Maybe, although for integers and BigInt the current idea is to not allow overloading.

I guess though that we can allow it and when it's ambiguous your specification algorithm will need to explicitly convert to the appropriate IDL type first (or explicitly convert to the relevant Infra type, e.g., if you want to turn a list into a set).
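One way to picture that explicit step for the (ByteString or Uint8Array) case is to have the algorithm name its target type and hand back a tagged value, as in the sketch below. The tagged representation and the function name are assumptions for illustration only.

```typescript
// Assumption: a value of the IDL union is modeled as a tagged object.
type ByteStringOrUint8Array =
  | { idlType: "ByteString"; value: string }
  | { idlType: "Uint8Array"; value: Uint8Array };

// Sketch: the caller must state which member of the union it wants,
// so a byte sequence is never converted ambiguously.
function convertByteSequence(
  bytes: readonly number[],
  target: "ByteString" | "Uint8Array"
): ByteStringOrUint8Array {
  bytes.forEach(function (b) {
    if (Math.floor(b) !== b || b < 0 || b > 0xff) {
      throw new TypeError("not a byte: " + b);
    }
  });
  if (target === "ByteString") {
    let s = "";
    bytes.forEach(function (b) {
      s += String.fromCharCode(b);
    });
    return { idlType: "ByteString", value: s };
  }
  return { idlType: "Uint8Array", value: new Uint8Array(bytes) };
}
```

The design choice is that there is no default branch for "whatever fits": leaving out the target is a compile-time error rather than a silent guess.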