Open danypype opened 6 years ago
@brianc, what are your thoughts on this?
Reading through https://github.com/nodejs/node/blob/master/lib/_stream_readable.js, specifically `push()`, `readableAddChunk()` and `addChunk()`, I learned that incoming data chunks are stored internally in a linked list:
```js
// A linked list is used to store data chunks instead of an array because the
// linked list can remove elements from the beginning faster than
// array.shift()
```
This seems preferable to storing them temporarily in an array.
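To illustrate the trade-off, here is a minimal sketch of the idea with a hypothetical `ChunkList` (not Node's actual implementation): removing from the head of a singly linked list only moves a pointer, while `array.shift()` re-indexes every remaining element.

```js
// Hypothetical minimal version of the idea; Node's real buffer list
// lives in its stream internals.
class ChunkList {
  constructor() {
    this.head = null;
    this.tail = null;
    this.length = 0;
  }

  push(chunk) {
    const entry = { data: chunk, next: null };
    if (this.tail) this.tail.next = entry;
    else this.head = entry;
    this.tail = entry;
    this.length++;
  }

  shift() {
    // O(1): only the head pointer moves. array.shift() is O(n) because
    // every remaining element gets shifted down one index.
    if (!this.head) return null;
    const { data } = this.head;
    this.head = this.head.next;
    if (!this.head) this.tail = null;
    this.length--;
    return data;
  }
}
```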
I think the root of the problem is that `batchSize` defaults to 100 while `highWaterMark` defaults to 16. This suggests that a simpler solution might be to limit the number of rows fetched at once with
```js
this.batchSize = Math.max(this.highWaterMark, (options || {}).batchSize || 100)
```
or
```js
const readAmount = Math.min(this.highWaterMark, Math.max(size, this.batchSize))
```
or (as a user) to make sure they are set in tandem.
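For example, a user could pass the two options together when constructing the stream. This is a sketch assuming pg-query-stream's `new QueryStream(text, values, options)` constructor accepts both `batchSize` and `highWaterMark`; the query and sizes are illustrative:

```js
const pg = require('pg');
const QueryStream = require('pg-query-stream');

const pool = new pg.Pool();

pool.connect((err, client, release) => {
  if (err) throw err;
  // Keep the two knobs in tandem so a single batch never overshoots
  // the stream's internal buffer (values here are illustrative).
  const query = new QueryStream(
    'SELECT * FROM generate_series(0, $1) num',
    [1000000],
    { batchSize: 16, highWaterMark: 16 }
  );
  const stream = client.query(query);
  stream.on('data', (row) => {
    // consume each row
  });
  stream.on('end', release);
});
```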
Hello @brianc,
According to the Node.js documentation on `readable._read`, I think this is the way a `Readable` stream is supposed to handle back-pressure.

I found out that after calling `resume()` on a paused `Readable`, Node will immediately call `readable._read(size)` even if there's already a lot of data in the `Readable`'s internal buffer.

A way of making sure not too much data gets allocated in the internal buffer is to check the return value of `readable.push()` and take action according to the documentation.
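To make the pattern concrete, here is a minimal sketch of a `Readable` that stops pushing as soon as `push()` returns `false` and resumes on the next `_read()` call. The `fetchRows` source is hypothetical, standing in for however rows are actually fetched:

```js
const { Readable } = require('stream');

// fetchRows is a hypothetical async source: fetchRows(n, cb) invokes
// cb(err, rows) with up to n rows, and with [] once the source is drained.
class RowStream extends Readable {
  constructor(fetchRows, options) {
    super({ objectMode: true, ...options });
    this._fetchRows = fetchRows;
    this._pending = []; // rows fetched but not yet pushed
    this._reading = false;
    this._done = false;
  }

  _read(size) {
    // Drain rows left over from an earlier fetch first. Stop as soon as
    // push() returns false: the internal buffer has hit highWaterMark,
    // and Node will call _read() again when it wants more.
    while (this._pending.length > 0) {
      if (!this.push(this._pending.shift())) return;
    }
    if (this._done) {
      this.push(null); // signal end-of-stream
      return;
    }
    if (this._reading) return; // a fetch is already in flight
    this._reading = true;
    this._fetchRows(size, (err, rows) => {
      this._reading = false;
      if (err) return this.destroy(err);
      if (rows.length === 0) this._done = true;
      this._pending.push(...rows);
      this._read(size); // push what was just fetched (or signal end)
    });
  }
}
```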
I've made some modifications and also added a couple of tests to illustrate what these changes aim to fix.
Thank you,