whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/

Question about stream handling around fetch requests with integrity metadata #1754

Open trflynn89 opened 6 months ago

trflynn89 commented 6 months ago

Hello - we've slowly been making the Ladybird browser use the streams spec for fetch, and ran into a couple of issues with fetch requests that have integrity metadata.

In main fetch step 22, we have (abbreviated here):

22. If request’s integrity metadata is not the empty string, then:
    3. Let processBody given bytes be these steps:
         3. Run fetch response handover given fetchParams and response.
    4. Fully read response’s body given processBody and processBodyError.

Then, in fetch response handover, we create a transform stream to pipe the response's body through:

6. If internalResponse’s body is null, then run processResponseEndOfBody.
7. Otherwise:
    1. Let transformStream be a new TransformStream.
    2. Let identityTransformAlgorithm be an algorithm which, given chunk, enqueues chunk in transformStream.
    3. Set up transformStream with transformAlgorithm set to identityTransformAlgorithm and flushAlgorithm set to processResponseEndOfBody.
    4. Set internalResponse’s body’s stream to the result of internalResponse’s body’s stream piped through transformStream.
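For readers more familiar with the JavaScript Streams API than the spec algorithms, step 7 can be sketched roughly as below. This is only an illustration, not the spec's or Ladybird's implementation; `handOver` and `onEndOfBody` are hypothetical names standing in for the handover step and processResponseEndOfBody.

```typescript
// Hypothetical helper mirroring fetch response handover step 7.
// `onEndOfBody` stands in for processResponseEndOfBody (an assumption).
function handOver(
  body: ReadableStream<Uint8Array>,
  onEndOfBody: () => void,
): ReadableStream<Uint8Array> {
  const transformStream = new TransformStream<Uint8Array, Uint8Array>({
    // Identity transform: forward every chunk unchanged.
    transform(chunk, controller) {
      controller.enqueue(chunk);
    },
    // Runs once the writable side is closed, i.e. the body has ended.
    flush() {
      onEndOfBody();
    },
  });
  // pipeThrough() locks `body` and returns the transform's readable side.
  // It throws a TypeError if `body` is already locked.
  return body.pipeThrough(transformStream);
}
```

The caller would then replace the body's stream with the returned stream, which is what step 7.4 describes.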

The first issue we hit is that when we fully read the response body, we acquire a reader for the body's stream. This locks the stream. I might be missing something, but I don't see anywhere that releases this acquired reader (and thus unlocks the stream). This causes an assertion failure later when the fetch response handover is run - the first step of piping through states:

1. Assert: ! IsReadableStreamLocked(readable) is false.

So should the steps for fully reading the body include releasing the acquired reader in its success / error steps? (Incremental reading appears to leave the stream locked as well).
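The locking behaviour can be reproduced with the JavaScript Streams API. This is a minimal sketch with a hypothetical `lockDemo` function. One difference to keep in mind: the JavaScript-facing `pipeThrough()` throws a `TypeError` on a locked stream, whereas the spec-internal "piped through" algorithm asserts.

```typescript
// Hypothetical demo: acquiring a reader locks the stream until it is released.
function lockDemo() {
  const stream = new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(new Uint8Array([1, 2, 3]));
      controller.close();
    },
  });

  // Analogous to "fully read" acquiring a reader for the body's stream.
  const reader = stream.getReader();
  const lockedAfterGetReader = stream.locked;

  // Piping a locked stream is rejected up front.
  let pipeThrewWhileLocked = false;
  try {
    stream.pipeThrough(new TransformStream());
  } catch (e) {
    pipeThrewWhileLocked = e instanceof TypeError;
  }

  // Releasing the reader unlocks the stream again.
  reader.releaseLock();
  const lockedAfterRelease = stream.locked;

  return { lockedAfterGetReader, pipeThrewWhileLocked, lockedAfterRelease };
}
```

If the fully-read steps released their reader on completion, the stream would be unlocked again, as in the last part of the sketch.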

The second issue we hit is that the body's stream is actually closed by the time we get to piping through to the transform stream. This happens after the body is extracted:

12. If action is non-null, then run these steps in parallel:
    1. Run action.
    Whenever one or more bytes are available and stream is not errored, enqueue the result of creating a Uint8Array from the available bytes into stream.
    When running action is done, close stream.

The stream isn't closed right away there, because we just enqueued some data - rather, it gets marked as "close requested". When we fully read the response body from main-fetch, we receive the queued data, and then the stream is actually closed.

So then when we enter the transform stream steps, the response body's stream is closed, which prevents the pipe-through operation from actually doing anything (as far as I can tell), because there's no queued data (it was taken by fully-read) and no way to pull more data.
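The sequence above can be approximated with the JavaScript Streams API. This is a sketch under assumptions: `closeDemo` is a hypothetical name, and the reader is released manually before piping so that the lock from the first issue does not get in the way.

```typescript
// Hypothetical demo: a stream that was closed with data still queued is
// fully drained by the first consumer, leaving nothing for a later pipe.
async function closeDemo() {
  const stream = new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(new Uint8Array([1, 2, 3]));
      // The queue is non-empty, so this only requests a close.
      controller.close();
    },
  });

  // First consumer (analogous to "fully read"): takes the queued chunk,
  // after which the stream actually closes.
  const reader = stream.getReader();
  const first = await reader.read();
  const second = await reader.read();
  reader.releaseLock();

  // Second consumer (analogous to the handover's transform stream).
  const piped = stream.pipeThrough(new TransformStream<Uint8Array, Uint8Array>());
  const out = piped.getReader();
  let pipedChunks = 0;
  while (!(await out.read()).done) {
    pipedChunks++;
  }

  return {
    firstLength: first.value ? first.value.length : 0,
    secondDone: second.done,
    pipedChunks,
  };
}
```

In this sketch the first reader receives the only chunk and the piped stream ends without producing any, which matches the behaviour described above.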

annevk commented 6 months ago

Oh no. I hope @ricea and @MattiasBuelens can help out here. Thanks for raising this!

ricea commented 6 months ago

I think the key point is this step:

4. Set internalResponse’s body’s stream to the result of internalResponse’s body’s stream piped through transformStream.

This replaces internalResponse’s body’s stream with the result of the pipe operation. The original stream is now locked, but future steps use the output of the pipe operation, which is not locked.

trflynn89 commented 6 months ago

But that happens after the pipe operation - the body's stream gets set to the result of piping its current stream through transformStream. And it's that piping operation itself that immediately fails, as the readable stream it is provided (the body's current stream) is locked and closed.