Open lucacasonato opened 7 months ago
Has there been discussion somewhere publicly? π
@saschanaz No, not of any specifics. The background here is this issue: https://github.com/whatwg/fetch/issues/1291#issuecomment-2009010502. @annevk mentioned we'd need to figure out how to make this work IDL wise. This is my attempt at that (it enables updating the BodyInit
overload in Fetch from:
typedef (Blob or BufferSource or FormData or URLSearchParams or USVString) XMLHttpRequestBodyInit;
typedef (ReadableStream or XMLHttpRequestBodyInit) BodyInit;
to
typedef (Blob or BufferSource or FormData or URLSearchParams or USVString) XMLHttpRequestBodyInit;
typedef (ReadableStream or async iterable<BufferSource> or XMLHttpRequestBodyInit) BodyInit;
Hmm, I see. BTW please include the full PR template: https://github.com/whatwg/webidl/blob/main/PULL_REQUEST_TEMPLATE.md
I wonder how this ends up working in practice. Say you have Blob or async iterable<T> or DOMString
. If you pass in a Blob
with @@asyncIterable
support, how is it handled? I guess @@asyncIterable
comes later, probably before toString
?
cc @petervanderbeken @natechapin
What is being proposed exactly, to what WebIDL type would it be converted? A reference to an Iterator Record
?
I wonder how this ends up working in practice. Say you have Blob or async iterable
or DOMString. If you pass in a Blob with @@asyncIterable support, how is it handled? I guess @@asyncIterable comes later, probably before toString?
@annevk I have specified this behaviour in an update to the overload resolution algorithm in this PR already. The gist is: an async iterable<T>
has the same behaviour for distinct types as sequence<T>
(which one could think of as the eager synchronous version of async iterator<T>
).
I have updated the graphic in the spec that explains this too:
In practice:
async iterator<T>
can be distinguished from all types but object
and sequence<T>
sequence<T>
, a matching interface is considered more important than a platform object that implements the iterable, or async iterable protocol. So specifically for the BodyInit
with Blob
case: nothing changes.sequence<T>
, primitives like strings, even if they have @@iterator
on their prototype, are not considered for async iterator<T>
overloads, and are also not valid async iterator<T>
values. async iterator<T>
requires values are Type(V) == Object
, in addition to having either @@iterator
or @@asyncIterator
.What is being proposed exactly, to what WebIDL type would it be converted? A reference to an Iterator Record?
@petervanderbeken I am proposing the addition of an async iterable<T>
type, which mirrors sequence<T>
in many ways, but unlike it is:
If you imagine a matrix of JS stream primitives, with asynchronicity on one axis, and eagerness on the other, this fills out the square opposite of sequence<T>
.
sync | async | |
---|---|---|
eager | JS: Array<T> ; IDL: sequence<T> |
JS: Promise<Array<T \| Promise<T>>> ; async sequence<T> (doesn't exist) |
lazy | JS: { @@iterator: Iterator<T> } ; IDL: iterable<T> (doesn't exist) |
JS: { @@asyncIterator: AsyncIterator<T> } ; IDL: async iterable<T> (this PR) |
The use case for this type are Web APIs that take "streams" of values (of unknown length), that they then process in some way. A real Web API that does this right now is ReadableStream.from()
. It takes objects that are iterables as a Web IDL any
type, and performs manual processing of this iterable. For one API, this is fine, but there are other proposed APIs that would also take async iterables as arguments:
new Response(asyncIterable)
: this overload of Response
's BodyInit
aims to simplify interop of fetch bodies with Node's own stream APIs, which implement @@asyncIterable
(see https://github.com/whatwg/fetch/issues/1291).crypto.subtle.digest(algorithm, asyncIterable)
: this overload of subtle digest would allow digesting large sets of data without having to load them into memory entirely (see https://github.com/w3c/webcrypto/issues/73 and https://github.com/w3c/webcrypto/issues/73#issuecomment-1003741806).Introduction of this async iterable<T>
simplifies all of these, because the entirety of the iterable iteration logic is now pulled into Web IDL, where all callers can share it.
What is being proposed exactly, to what WebIDL type would it be converted? A reference to an Iterator Record?
Per https://whatpr.org/webidl/1397/402a886...14f1bdb.html#js-async-iterable the result indeed seem to be iterator record. (Adding because I don't think the comment above actually answers this.)
This design sounds good to me. It seems the Build is currently failing though. Could you write a corresponding PR for Streams to show it in action so we have some assurance it can be adopted?
cc @MattiasBuelens
3. As with
sequence<T>
, primitives like strings, even if they have@@iterator
on their prototype, are not considered forasync iterator<T>
overloads, and are also not validasync iterator<T>
values.async iterator<T>
requires values areType(V) == Object
, in addition to having either@@iterator
or@@asyncIterator
.
This would technically be a breaking change for ReadableStream.from()
, since we currently do accept strings (see this test).
That said, I wrote that test more for the sake of completeness, and I don't expect much actual usage in the wild. We can consider making ReadableStream.from()
throw on strings, or we could change its signature to accept async iterable<any> or string
instead.
@MattiasBuelens I was not aware of this behaviour in Streams. I think we should stick with the behaviour where async iterator<T>
must be an object in this PR (mirroring sequence<T>
). Overload resolution becomes tricky and rather confusing otherwise.
Let's discuss whether to accept or error on DOMString
arguments in the ReadableStream.from()
PR.
@annevk I had already opened a draft PR to the streams spec, showing the integration yesterday. Here is a link: https://github.com/whatwg/streams/pull/1310.
Do we have Web IDL tests in WPT? I don't think so, right? I guess no tests specifically for this are relevant.
Considering this is mostly an editorial refactor, do we even need "implementer interest"? There is nothing to implement until an upstream spec uses this. I'll ask for support anyway though:
new Response(asyncIterable)
. @matteocollina can you comment?Also @jasnell, any thoughts?
Got some feedback from @saschanaz elsewhere that the constraint that async iterable<T>
is only valid in JS->IDL positions (rather than also in IDL->JS positions) should be more prominently highlighted. I'll address this tomorrow
Do we have Web IDL tests in WPT? I don't think so, right?
We do, but they're weird since they test things via APIs that use those features. https://github.com/web-platform-tests/wpt/tree/master/webidl/ecmascript-binding
I guess no tests specifically for this are relevant.
I think linking to the tests for streams using this feature would be good. Especially if you can verify that everything you introduce in this PR is covered by some aspect of the streams tests. (Which is probably the case since the streams tests are quite thorough, IIRC.)
Considering this is mostly an editorial refactor, do we even need "implementer interest"? There is nothing to implement until an upstream spec uses this. I'll ask for support anyway though:
Implementer support for Web IDL stuff like this is tricky and requires a judgement call. I think we should CC the bindings teams and give them a chance to weigh in, so @japhet and @yuki3 for Chromium. For example if they were concerned about how to implement this in the bindings layer, or if they were concerned that introducing this feature would be bad because then too many specs might use it and actually it should be rare.
However, since this doesn't require any immediate changes to any implementations, I think we shouldn't block on requiring such support, if we don't hear back.
Regarding strings, I think we should follow the behavior of async iterator helpers's AsyncIterator.from()
; /cc @bakkot. I cannot tell what that behavior is, however. Iterator.from()
definitely accepts strings. But AynscIterator.from()
calls GetIteratorFlattenable with a second parameter whose value is async
, which is neither accept-strings
nor reject-strings
.
AsyncIterator.from
will accept strings because Iterator.from
does. But do note that it's specifically a coercion method. For something like ReadableStream.from
, which is also a coercion method, you may wish to follow that precedent, but I'm not sure you'd want to do so for other async-iterable-taking things.
The way AsyncIterator.from
works is to look up Symbol.asyncIterator
, and if that fails then fall back to Symbol.iterator
with the result wrapped the same way for await
does (modulo some quibbles not relevant here), and if that fails to use the .next
method directly (which only iterator-taking things should do, which is to say, nothing except AsyncIterator.from
and AsyncIterator.prototype.flatMap
).
And since it will not specifically reject strings, and strings have a Symbol.iterator
, it will accept them, the same way for await
does.
(Sorry the spec for AsyncIterator.from
is out of date; I'm working through other parts of the semantics and it doesn't really make sense to get the spec updated until those are sorted. But this question, at least, is settled.)
Thanks for the quick response. I'm unsure whether we should think of new Response(asyncIterable)
as a "coercion method", but I lean toward yes... and those are our two use cases so far, so, I'm unsure where that leaves us for the general Web IDL type.
Personally I would specify this by making async iterable<T>
reject strings, and then adding or USVString
to those places which should accept strings, just to ensure that future users of the type have actually decided to take strings on purpose instead of it happening by accident.
Plus if you take a string explicitly then you can use it directly instead of having to look up and invoke String.prototype[Symbol.iterator]
, which is going to be... slower.
Thanks for the comments @domenic and @bakkot. I agree strongly with @bakkot that in the Web IDL type, we should only allow objects implementing the async iterable protocol for the async iterable<T>
type. Anything else would be a significant hassle with a high chance for unexpected behaviour when plumbed into the overload / union resolution algorithm. Also there is already precedent in Web IDL that iterable taking types do not support iterating string (sequence<T>
does not support iterating strings, but does iterate objects implementing the iterable protocol).
For the streams specific discussion, lets move it right here: https://github.com/whatwg/streams/pull/1310#issuecomment-2022484496
Folks, could you take another look? I have clarified the allowed positions for async iterables (namely not in attributes), and have added guidance that: a) async iterable IDL values can not be constructed in IDL, and always must come from language bindings b) that spec authors should generally return an interface with an async iterable declaration, not an async iterable IDL value
@annevk has confirmed implementer interest for WebKit on Matrix - is anyone else in favor, or opposed?
This LGTM
Gently pushing for reviews again :)
Still need positions from Chromium (@domenic) and Mozilla (@saschanaz).
TC39 is considering rejecting strings in most (new) iterable-taking positions in general (and of course async iterable taking positions as well). It'll be discussed at the plenary at the end of the month, and I'm guessing it'll be adopted.
@lucacasonato coupling this with the Streams PR I'm happy to consider Chromium and WebKit as interested and nobody opposed. This PR needs rebasing, the checklist in OP needs a bit more work, and maybe @MattiasBuelens can do a final pass after that. With that I think we're good to go.
I'd like to ask @petervanderbeken for the position as he will be the implementer.
A parallel question for distinguishability: currently it says it's distinguishable with interfaces and dictionaries, but what happens if the interface has async iterable<T>;
? sequence also says it's distinguishable, so I think I just lack some knowledge here π
@saschanaz The interface can be distinguished from other objects because it is branded. A union of interface
and async iterable<T>
will always attempt to match the interface
first. This is also why dictionary
and interface
are distinguishable.
Union resolution goes like this in this case: `MyInterface | MyDictionary | async iterable
V
is an instance of the interface. it's an interfaceV
has [Symbol.asyncIterable]
. it's an async iterable@lucacasonato I fixed the remaining Bikeshed issues in this branch. Could you pull those changes into your branch? π
@domenic What tests should I write for this? I don't think we have specific webidl tests in WPT, right? Only tests for features that use the webidl types.
Just to make sure I understand correctly, this is blocked on tests and this is best tested through the Streams change https://github.com/whatwg/streams/pull/1310 which is currently still in draft. So once the Streams change is ready and has test coverage (which should include coverage for the IDL bits) this is all good to go?
So during an experiment we ran for new Response(async iterable<Uint8Array>)
, we discovered another fun edge case to think about: when having a union of string | async iterable<Uint8Array>
, what should a boxed string resolve as? In the current spec text it resolves as a async iterable<DOMString>
, which would cause an error during iteration. This happens because the boxed string is both an object, and has a [Symbol.iterator]
method.
We have two options here:
async iterable
, like we do with primitives (kinda weird, but means this doesn't become a weird edge case only in fetch and is instead uniform across the platform)fetch
, and in the future in any other string | async iterator
WebIDL unions.The big question is whether, assuming ReadableStream.from("asd")
throws, ReadableStream.from(new String("asd"))
should throw, or create a stream of string chunks.
Any thoughts?
Copying over a thought from @annevk:
Treating string objects and values in the same way whenever possible seems preferable. Which argues for 1 as otherwise each API would have to deal with this anew.
In TC39 we went the other direction, and said that String objects are just another kind of object. Having the check conceptually be "is a primitive" is the simplest mental model.
If the user provides a primitive wrapper Object such as a String Object, however, it should be treated like any other Object.
That also seems reasonable, but the problem is that any API currently accepting a string would need to juggle a lot of complexity to safely adopt async iterable<T>
.
Perhaps there's a third solution whereby string | async iterable<T>
cases involving a String object are resolved by going down the string
path? But if there's no string in the type union we'd treat a String object as an Object for the purposes of async iterable<T>
. (Of course, this has the opposite problem for when you start out with async iterable<T>
and later overload with string
, but that seems far less likely.)
I like the option where we just handle this in the union.
This commit lifts the async iterator processing logic from the
ReadableStreamFromIterable
operation in the Streams spec, to a new WebIDL typeasync iterable<T>
.This will clean up the streams spec, and enable a change in the Fetch spec to directly allow taking async iterable as request/response bodies, without users having to pass them to
ReadableStream.from
first.(See WHATWG Working Mode: Changes for more details.)
Preview | Diff