Data guidelines for Web IDL iterable, setlike and maplike declarations

foolip commented 4 years ago

Web IDL has some higher-level declarations that result in multiple things being exposed on an interface, and it would be great to have some data guidelines for how to represent these.

They are:

iterable<value_type> and iterable<key_type, value_type> result in @@iterator, entries, keys, values and forEach being exposed
async iterable<value_type> and async iterable<key_type, value_type> result in @@asyncIterator and some other things being exposed.
maplike<key_type, value_type> results in @@iterator, forEach, size, entries, keys, values, get, has, clear, delete and set being exposed
setlike<type> results in @@iterator, forEach, size, entries, keys, values, has, clear, delete and set being exposed

Apologies for inevitable mistakes in the above list.

There's already some inconsistency in how such things are represented in BCD. Here, as a "Support of for...of" entry:

https://github.com/mdn/browser-compat-data/blob/1ca3bd9afdea60d0cbeb6f270274ce3d492824d0/api/FormData.json#L102-L104

And here, as the @@iterator symbol and other entries:

https://github.com/mdn/browser-compat-data/blob/1ca3bd9afdea60d0cbeb6f270274ce3d492824d0/api/StylePropertyMapReadOnly.json#L434-L436

What should we do?

Opinion: As a reader of the StylePropertyMapReadOnly compat table it's not super useful to know that @@iterator is individually supported, it's more useful to know whether for (const [prop, val] of stylePropertyMap) { ... } will work. Knowing that entries could be more useful, however. Perhaps there should be a grouping of entries to reduce clutter?

ddbeck commented 4 years ago

This seems somewhat related to the mixins problem (#472). Like mixins, I think we ought to show the complete API, not the abstraction.

IDL is a nice notation for specifying APIs, but it's the actual names an API exposes that web developers want to know about. If some iterable implements an entries() method, entries() should show up in our data set, even if the IDL doesn't say so directly. Similarly, I agree that developers care about for … of, not the @@iterator method as such.

That said, the name @@iterator does seem like a convenient key to use for the feature, though we'd probably want a human-readable description, such as "<code>for … of</code> support (<code>@@iterator</code> method). We have plenty of precedents for requiring a specific description for a given feature name.

If we were to do this, we'd need to add the data guideline and open an issue to bring existing features into compliance.

foolip commented 4 years ago

I agree it's similar to mixins in the sense that a single thing in Web IDL can expand to multiple things for web developers.

Listing all of these things separately ensures we're not losing information, but https://developer.mozilla.org/en-US/docs/Web/API/StylePropertyMapReadOnly#Browser_compatibility really is quite noisy, it's hard to spot which things aren't there for iteration. Also notable is that you can't easily tell what kinds of values you'd iterate.

I don't have a concrete proposal here, but is there some kind of grouping of entries and/or collapsing of the presentation that would make this more usable to readers?

ddbeck commented 4 years ago

Yeah, it would be good to group the data. One option might be to use that interface as a feature itself, in a structure like this:

StylePropertyMapReadOnly
- get
- has
- iterable
- @@iterator
- entries
- forEach
- …
- …

This approach presents some complications. We don't have a lot of abstract features like iterable, outside of CSS, so we'd have to make some decisions about whether iterable is supported only if all the subfeatures are supported and so on. We'd also have to figure out how to communicate that abstract quality (e.g., a description like "Implements the iteration protocol" or something).

Another approach might be to prefix such names (e.g., iterable_@@iterator, iterable_entries). This is less machine-useful, but gets us a useful sort for tables. And it doesn't add depth to the feature structure where no such depth exists in the features themselves (that is StylePropertyMapReadOnly.has and StylePropertyMapReadOnly.entries are the same depth in the BCD hierarchy as they are in their actual namespace).

Maybe there are other options—I'm not sure.

foolip commented 4 years ago

We don't have a lot of abstract features like iterable, outside of CSS

It is actually somewhat widespread by now, it just isn't in BCD in most cases. Here are all the cases I found via reffy-reports:

https://drafts.css-houdini.org/css-typed-om/#the-stylepropertymap (the example we've been using here) https://drafts.css-houdini.org/css-typed-om/#unparsedvalue-objects https://drafts.css-houdini.org/css-typed-om/#cssnumericarray https://drafts.css-houdini.org/css-typed-om/#transformvalue-objects https://dom.spec.whatwg.org/#interface-nodelist https://dom.spec.whatwg.org/#interface-domtokenlist https://fetch.spec.whatwg.org/#headers-class https://immersive-web.github.io/webxr/#xrinputsourcearray-interface https://streams.spec.whatwg.org/#rs-class-definition https://url.spec.whatwg.org/#interface-urlsearchparams https://xhr.spec.whatwg.org/#interface-formdata https://w3c.github.io/encrypted-media/#mediakeystatusmap-interface https://wicg.github.io/local-font-access/#fontiterator https://wicg.github.io/native-file-system/#api-filesystemdirectoryhandle

Either the grouping or the prefixing approach seems OK to me in the data model. Another option is an additional property that says that entry is part of the iterable behavior, either as a very specific exists_because_iterable_in_spec_idl_definition="true" or some more generic group="iterable".

ddbeck commented 4 years ago

It is actually somewhat widespread by now, it just isn't in BCD in most cases.

Right! To clarify, I meant that we don't have many of the abstract features in BCD. Abstract features are definitely widespread, especially if we count mixins and CSS types, and the like.

The list of applicable cases is really helpful. Thank you!

Mapping to IDL in the schema more explicitly is an interesting idea too. Hmm. Gotta think on this some more.

(If anyone else is consuming BCD and lurking on this issue, I'd like to hear your take on these suggestions!)

Elchi3 commented 4 years ago

I think in the case of iterables, we can define an abstract feature iterable in BCD (and have it mapped to the IDL concept of iterable). It should have a standard description and we need to put it in the data guidelines.

MDN docs will have to mention that a class is iterable and also define what it means in more details. This isn't the job of the compat table. Also, I think the compat story of all iterable features (for..of, entries, values, @@iterator, ...) will be the same, so the break down into the specific features is not needed and only means more data to review/maintain. If a browser doesn't implement a sub aspect of the iterable concept, we can mark the abstract BCD feature iterable as partial_implementation: true and add a note that mentions what isn't implemented (or we could then breakout into the sub aspects of iterable if really needed).

So far I don't think data consumers have requested that we list all methods of a class specifically and if we're consistent with an abstract feature iterable, a data consumer can derive the concrete class members' compatibility.

foolip commented 4 years ago

Also, I think the compat story of all iterable features (for..of, entries, values, @@iterator, ...) will be the same, so the break down into the specific features is not needed and only means more data to review/maintain.

I tried to find a counterexample to this, where something implied by iterable<T> in Web IDL was already present on an interface before it was converted to iterable<T> in implementation. However, much to my surprise I didn't find such a case while hunting around Chromium's source code and the history for some of the older interfaces listed in https://github.com/mdn/browser-compat-data/issues/6367#issuecomment-661044392.

Such cases might exist, but judging by this search it's not going to be something we run into a lot. Dealing with exceptions using partial_implementation: true or other annotations seems OK.

Elchi3 commented 4 years ago

Proposal for a data guideline:

Iterables

In WebIDL, APIs can be declared iterable resulting in @@iterator, entries, keys, values and forEach being exposed. In BCD, an abstract feature iterable is added to the API in that case and the individual features of the iterable protocol do not need to be explicitly recorded unless there are compatibility issues worth mentioning.

For example, the NodeList class is declared iterable in WebIDL and is represented as api.NodeList.iterable. It has the description Iterable (<code>@@iterator</code>), like this:

{
  "api": {
    "NodeList": {
      "__compat": {},
      "iterable": {
        "__compat": {
          "description": "Iterable (<code>@@iterator</code>)",
          "support": {}
        }
      }
    }
  }
}

Does this make sense?

ddbeck commented 4 years ago

This seems reasonable to me. I think we might want to expand the description a little (perhaps mentioning for … of?) and show a fake example of a partial implementation, but those are pretty minor details.

A bigger question: would we want one entry for each for iterable, async iterable, setlike, and maplike? Or would we want to have one guideline with a table breaking down what we expect to be exposed for each?

foolip commented 4 years ago

That makes sense for iterable! I suspect things will get trickier if we throw maplike and setlike into the mix, since the things they generate overlap. Any suggestions for how to deal with those at the same time? :)

jpmedley commented 4 years ago

I think before preceding, you should ask @chrisdavidmills to weigh in on how this would be rendered on MDN pages. If I want to know if something supports entries() I don't look for the word "iterable" particularly if I'm not a browser engineer. How do we solve that? Do you have pages for all the things @@iterable resolves to? If so, what do you put in the browser compat table? I guess the key could be InterfaceName.iterable or something like that. But then, we're asking web engineers to understand implementation details to instead of runtime details, which is what they care about.

chrisdavidmills commented 4 years ago

This does sound a bit esoteric for inclusion in BCD tables to me, but I'll wait for @Elchi3 to chime in on this when he is back in work next week. It is nice to include for completion, but should we have such values just listed in with all the other BCD entries?

Elchi3 commented 4 years ago

I suspect things will get trickier if we throw maplike and setlike into the mix, since the things they generate overlap.

Oh okay, that sounds complex and the compat story for the individual features is then depended on what was implemented first?

Any suggestions for how to deal with those at the same time? :)

Not yet, I think looking at some real world examples would help to come up with ideas.

I think before preceding, you should ask @chrisdavidmills to weigh in on how this would be rendered on MDN pages. If I want to know if something supports entries() I don't look for the word "iterable" particularly if I'm not a browser engineer. How do we solve that? Do you have pages for all the things @@iterable resolves to? If so, what do you put in the browser compat table? I guess the key could be InterfaceName.iterable or something like that. But then, we're asking web engineers to understand implementation details to instead of runtime details, which is what they care about.

As discussed above, the alternative is to add all sub features of these concepts to BCD and as MDN pages, but that seems like a lot of work and I'm not sure it is worth it to have pages for entries() and friends again and again. The abstract feature iterable as presented above could also always come with an mdn_url pointing to an explanation, of course. So, I think I'm assuming it is okay to ask web devs to understand these concepts and at the same time also explaining it properly on MDN which then avoids this repetition all over the place, but I haven't user tested this if it would work. So, it is just a guess. Do you have more information on what users actually need in order to understand this?

jpmedley commented 4 years ago

The abstract feature iterable as presented above could also always come with an mdn_url pointing to an explanation, of course.

I had a similar thought.

So, I think I'm assuming it is okay to ask web devs to understand these concepts

I'd rather think of this as a concession to a problem without an easy or obvious answer. Platform features should be described without recourse to implementation details whenever possible. On pages where it appears, I would suggest that it have a description that lists what it resolves to in addition to an explanation page. What I'm proposing is no different from having the same description in a property page as on the property list on the interface page. (I had a look at Chrome's IDL just now. It's only about a dozen pages.)

By the way, where did @@iterator come from? Is that an older IDL member? IDL currently uses iterable. (Changes to IDL are another reason I advocate keeping it off MDN.)

Elchi3 commented 4 years ago

Platform features should be described without recourse to implementation details whenever possible On pages where it appears, I would suggest that it have a description that lists what it resolves to in addition to an explanation page.

I whole-heartily agree to these principles and I try to stick to them when possible, as you say.

By the way, where did @@iterator come from?

This is from the ECMAScript symbol. ECMAScript also has builtin iterables. When APIs have them, IDL annotates them with iterable. In ECMAScript they appear as @@iterator on MDN and in the spec, too.

If web devs want to make own iterators for objects, they use Symbol.iterator. I'm not sure how people generally refer to these symbols. I thought @@iterator is quite common, but maybe we could also say "description": "Built-in iterator", and have an mdn_url pointing to what it means when an object has a natively built-in iterator and listing there all the methods you can use thanks to it.

I think the BCD "description" field isn't really meant to contain a long list of features, so I don't think the following is something we want to add there, or do we?

"description": "Built-in iterator: <code>@@iterator</code>, <code>entries</code>, <code>keys</code>, <code>values</code>, <code>forEach</code> and <code>for...of</code>",

teoli2003 commented 3 years ago

WebIDL also defines another keyword, stringifier, that leads to the creation of a toString method. Each time there is a stringifier on an interface, we should have a link either to the WebIDL entry (https://heycam.github.io/webidl/#idl-stringifiers or the link to the official spec ) or to the prose of the spec.

There is also the toJSON methods (long time ago describes by the WebIDL keywords jsonifier in Gecko's IDL and serializer in the proposed spec) that has an entry in the WebIDL (https://heycam.github.io/webidl/#idl-tojson-operation ) but I think recent specs have it as toJSON because the algorithm wasn't really standard.

foolip commented 3 years ago

@teoli2003 the reason that you'll see both [Default] object toJSON() and toJSON() without [Default] in spec IDL is because the default behavior isn't always what is wanted. For example, JSON.stringify(new URL("http://example.com/")) will return the string '"http://example.com/"' since the serialization of a URL is just a string. The default Web IDL behavior would have serialized an object with a number of properties like "origin" an "protocol" instead.

MattiasBuelens commented 2 years ago

A consistent representation for iterables (and async iterables) would also help TypeScript's DOM types. We want to add the appropriate methods (like .entries() and .forEach()) to the type definitions of iterable interfaces, but only if iteration is supported by at least two browser engines (per our policy).

At the moment, we use the support data of the values() method instead, since that is more commonly defined than @@iterator. This works, but it's a bit awkward. It would be nicer to have a "standard" entry for (async) iteration in BCD.

foolip commented 2 years ago

@MattiasBuelens For async iterables I suppose the feature name should be @@asyncIterator? Do you have a list of interfaces that support this so we could try it out?

MattiasBuelens commented 2 years ago

@foolip Yes, @@asyncIterator seems appropriate.

AFAIK there are only two async iterables in the platform at the moment:

At the time of writing, there are no browsers that support async iterating over a ReadableStream yet. (It would still be nice if it's already tracked in BCD though.)

Async iterating over a FileSystemDirectoryHandle is supported in Chrome and Safari. BCD currently has iteration methods like entries(), keys() and values() in its data, but no @@asyncIterator.

(Off-topic: the MDN docs for FileSystemDirectoryHandle's iteration methods incorrectly claim that they return an "array iterator", while they actually return an async iterator. We should probably fix that sometime. 😛)

foolip commented 2 years ago

@MattiasBuelens I've sent https://github.com/foolip/mdn-bcd-collector/pull/2349 to generate the correct tests. With that, we should be able to add that @@asyncIterator entry for FileSystemDirectoryHandle to BCD based on test results.

hamishwillee commented 1 year ago

FYI, I added an entry to note that ReadableStream is iterable from FF110 in #18824. Wasn't entirely sure of the consensus here so settled on:

      "async_iterable": {
        "__compat": {
          "description": "Async iterable (<code>@@asyncIterator</code>)</code>",

Note, I much prefer the form of the original description in the first post by @foolip - something like:

"async iterable": { 
   "__compat": { 
     "description": "Support of <code>await for...of</code>",

Why? Those symbols are opaque spec language. Meaningless to me, even after reading through the MDN JavaScript docs several times. Feels like we're making life harder for developers to no value.

hamishwillee commented 1 year ago

@Elchi3 Was your comment in https://github.com/mdn/browser-compat-data/issues/6367#issuecomment-661737781 agreed as the data guideline? I ran into this again in https://github.com/mdn/browser-compat-data/pull/18964.

NodeList, is defined in the IDL as iterable, but there is no spec link to this item.

Both the docs and BCD expand this out with subfeatures for: entries(), forEach(), item(), keys(), values(). The problem is that there is nowhere to link the spec to for the expanded methods - we can't even link to iterable. So was wondering what the "approved" way to do this is - since the case does not match the guideline.

FWIW I think ...

what MDN does in fully documenting the supported methods is correct.
that the data guideline makes sense (shown below) from a BCD point of view. My only concern is that if you don't split out the features the from an MDN perspective it is not obvious at the top level document that the "iterable" key is what tells you the compatibility for the methods.
```
...
  "NodeList": {
    "__compat": {},
    "iterable": {
      "__compat": {
        "description": "Iterable (<code>@@iterator</code>)",
        "support": {}
      }
    }
```
Note this is not a problem for the sub documents - because you can directly set the BCD key to be the iterable item.
You could fix that problem by making the text: "Iterable (supports forEach(), values ... and so on )",
We still need a spec URL link. I think that would be the IDL definition whichever way the data rule ended up: https://webidl.spec.whatwg.org/#idl-iterable

teoli2003 commented 1 year ago

What about adding two spec URL links:

one to https://webidl.spec.whatwg.org/#idl-iterable where iterable (and similarly maplike, setlike, …) is defined
one to the position in the relevant spec where the interface is defined as iterable. In the case of NodeList it would be: https://dom.spec.whatwg.org/#interface-nodelist

hamishwillee commented 1 year ago

What about adding two spec URL links:

one to https://webidl.spec.whatwg.org/#idl-iterable where iterable (and similarly maplike, setlike, …) is defined

one to the position in the relevant spec where the interface is defined as iterable. In the case of NodeList it would be: https://dom.spec.whatwg.org/#interface-nodelist

Good lateral thinking. That would be better here.

But I still need to know the agreed pattern for the data guideline.

hamishwillee commented 1 year ago

Further thinking, the suggested data guideline was to add iterable and only expand it out if there was some compatibility issue. Note however that just because the data says iterable does not mean that MDN rendering of this key could not expand out to add the forEach methods.

How do we reach a consensus?

hamishwillee commented 1 year ago

The stringifier case is solved by https://github.com/mdn/browser-compat-data/pull/19121 - essentially you add toString() in this case.

hamishwillee commented 1 year ago

The async_iterable case is being discussed in https://github.com/mdn/browser-compat-data/pull/19094. I have come around to the idea that we should document the actual methods/properties added by the symbol in IDL - because our existing data guideline is add features for all methods/properties of the interface whether or not there is a compatibility issue!

So the only discussion then is how you indicate support for for await of syntax, when it is not a function or method. I lean towards a key that says for_await_of but that doesn't matter as much as making sure that people can find out what the key means - so a description and a link to the MDN docs are a necessity.

mdn / browser-compat-data

Data guidelines for Web IDL iterable, setlike and maplike declarations #6367

Iterables