j-nolan opened this issue 1 week ago
**Update:** my theory is that something goes wrong in `ReadableStreamFrom()`, which is called because a `PassThrough` is not an `instanceof ReadableStream`. For some reason, this function drops the first 2-3 chunks. If I wrap the `PassThrough` in `ReadableStream.from(...)`, the problem disappears.

This is just a workaround for now. I'm still trying to understand the issue with `ReadableStreamFrom()`.
```js
// index.js
const fs = require("fs");
const { PassThrough } = require("stream");

async function run() {
  const readStream = fs.createReadStream("my-file.txt"); // you will need a file large enough (>0.5MB)
  const passThrough = new PassThrough();
  passThrough.on("data", (e) => {
    console.log("chunk length", e.length);
  });
  readStream.pipe(passThrough);

  const url = `http://localhost:8000`;
  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "text/plain",
    },
    body: ReadableStream.from(passThrough), // this is the workaround
    duplex: "half",
  });
  console.log("response status:", response.status);
}

run();
```
**Update:** the issue was not in `ReadableStreamFrom()`. After more digging, here is the problematic sequence:
1. `const readStream = fs.createReadStream("my-file.txt");` creates a `Readable` in paused mode.
2. `passThrough.on("data", ...)` switches the `PassThrough` to flowing mode.
3. `readStream.pipe(passThrough);` switches `readStream` to flowing mode as well; since `passThrough` is itself flowing, chunks start moving through it immediately.
4. `fetch` only starts reading from `passThrough` many ticks later; by then, the first few chunks have already been emitted and missed.

A better workaround than the one suggested above is therefore to ensure `passThrough` is not in flowing mode when `readStream` is piped into it. This can be done with `pause()`:
```js
const passThrough = new PassThrough();
passThrough.on("data", (e) => {
  console.log("chunk length", e.length);
});
passThrough.pause(); // back to paused mode before piping
readStream.pipe(passThrough);
```
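To make this sequence concrete, here is a minimal self-contained sketch of the mechanism (the chunk names and timings are invented for illustration; the late listener stands in for `fetch` attaching its reader several ticks after the pipe is set up):

```js
const { PassThrough } = require("stream");

const pt = new PassThrough();
// This listener switches pt to flowing mode.
pt.on("data", (c) => console.log("early listener:", c.toString()));

pt.write("chunk-1"); // emitted almost immediately: pt is flowing
pt.write("chunk-2");
setTimeout(() => pt.end("chunk-3"), 20);

// A consumer that attaches late only sees the chunks emitted after it
// attached; here it receives chunk-3 but never chunk-1 or chunk-2.
setTimeout(() => {
  pt.on("data", (c) => console.log("late consumer:", c.toString()));
}, 10);
```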
Full fixed snippet:
```js
const fs = require("fs");
const { PassThrough } = require("stream");

async function run() {
  const readStream = fs.createReadStream("my-file.txt");
  const passThrough = new PassThrough();
  passThrough.on("data", (e) => {
    console.log("chunk length", e.length);
  });
  passThrough.pause();
  readStream.pipe(passThrough);

  const url = `http://localhost:8000`;
  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "text/plain",
    },
    body: passThrough,
    duplex: "half",
  });
  console.log("response status:", response.status);
}

run();
```
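The same toy sequence shows why `pause()` helps: a paused `PassThrough` buffers its chunks instead of emitting them, so a late consumer still receives everything once the stream is resumed (the `resume()` call below stands in for `fetch` starting to read):

```js
const { PassThrough } = require("stream");

const pt = new PassThrough();
pt.on("data", (c) => console.log("early listener:", c.toString()));
pt.pause(); // back to paused mode: chunks are buffered, not emitted

pt.write("chunk-1");
pt.write("chunk-2");
pt.end("chunk-3");

// The late consumer attaches, then resumes the stream: all three
// chunks are delivered from the buffer, none are lost.
setTimeout(() => {
  pt.on("data", (c) => console.log("late consumer:", c.toString()));
  pt.resume();
}, 10);
```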
I think this issue can be closed, but I'll leave it open for a couple of days in case someone is willing to double-check my reasoning.
### Version

Reproduced on at least v23.2.0 and v20.14.0

### Platform

Reproduces on both Linux and Windows

### Subsystem

No response
### What steps will reproduce the bug?

1. Run a basic HTTP server supporting HTTP chunked requests (`python3 server.py`; a stand-in sketch follows this list).
2. Write a long text into `my-file.txt`. The file must be large enough to be chunkable (~0.5 MB). (Here is a sample file for your convenience.)
3. Run the Node.js code shown at the top of this issue (`node index.js`).
### How often does it reproduce? Is there a required condition?

This only happens when the stream passed as `body` to the `fetch` function is piped through a `PassThrough`.

### What is the expected behavior? Why is that the expected behavior?

The number of chunks read by the `PassThrough` should be identical to the number of chunks received by the HTTP server.

### What do you see instead?
The server consistently receives fewer chunks than were sent from the client:
Client (7 chunks):
Server (only 5 chunks):
The number of chunks the server actually receives varies and seems somewhat random (sometimes 4, sometimes 5).
The issue does not occur when the read stream is not piped through a `PassThrough`. This implies that the `PassThrough` somehow impacts how `fetch` consumes the stream.

### Additional information

Apologies if I'm missing something. As far as I can tell, this is not the expected behavior of `PassThrough`.