
Async iteration over tcp stream produces more chunks #52765

Open JaoodxD opened 2 weeks ago

JaoodxD commented 2 weeks ago

Version

20.12, 22

Platform

Microsoft Windows NT 10.0.19045.0 x64

Subsystem

No response

What steps will reproduce the bug?

Below are two examples of code that read data from a TCP stream. The first reads the data with for-await iteration and collects 2731 chunks. The second uses a stream.on('data') listener and collects 1835 chunks.

For-await listing:

import { createServer, Socket } from 'node:net'
import { Readable } from 'node:stream'

const server = createServer(socket => {
  const stream = Readable.from(numbers())
  stream.pipe(socket)
})
server.listen(3000, '127.0.0.1')

const client = new Socket()
client.connect(3000, '127.0.0.1')

const data = []
for await (const chunk of client) data.push(chunk)
console.log(data.length) // 2731 chunks
// console.table(data.map(({ length }) => length))
console.log(Buffer.concat(data).length)
client.destroy()
server.close()

async function* numbers (max = 10_000_000) {
  for (let i = 0; i < max; i++) yield i.toString()
}

Event-based listing:

import { createServer, Socket } from 'node:net'
import { Readable } from 'node:stream'

const server = createServer(socket => {
  const stream = Readable.from(numbers())
  stream.pipe(socket)
})
server.listen(3000, '127.0.0.1')

const client = new Socket()
client.connect(3000, '127.0.0.1')

const data = []
client.on('data', chunk => data.push(chunk))
client.on('end', () => {
  console.log(data.length) // 1835 chunks
  // console.table(data.map(({ length }) => length))
  console.log(Buffer.concat(data).length)
  client.destroy()
  server.close()
})

async function* numbers (max = 10_000_000) {
  for (let i = 0; i < max; i++) yield i.toString()
}

How often does it reproduce? Is there a required condition?

Always

What is the expected behavior? Why is that the expected behavior?

Both for-await and on('data') should produce an equal number of chunks.

What do you see instead?

for-await produces more chunks.

Additional information

No response

lpinca commented 1 week ago

IIRC the async iterator uses readable.read(). There is no guarantee that flowing and paused modes produce the same number of chunks.
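
A minimal sketch of why the counts can differ (not the TCP repro above, just a plain PassThrough; the variable names are made up for illustration): in paused mode, readable.read() with no size argument returns whatever happens to be sitting in the internal buffer as one concatenated chunk, so chunk boundaries reflect when the consumer reads, not how the data was written.

import { PassThrough } from 'node:stream'

// Hypothetical illustration: three writes land in the internal buffer before
// the consumer reads. A single paused-mode read() then drains the whole buffer
// as one concatenated chunk, so the chunk count depends on read timing rather
// than on write boundaries.
const stream = new PassThrough()
stream.write('a')
stream.write('b')
stream.write('c')
stream.end()

stream.once('readable', () => {
  const chunk = stream.read() // returns all buffered data: <Buffer 61 62 63>
  console.log(chunk.length)   // 3 bytes delivered as a single chunk
})

In the TCP repro, how many chunks each mode ends up with just depends on how the socket data happens to be buffered between reads; only the total byte count printed by Buffer.concat(data).length is expected to match.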