paulmillr / readdirp

Recursive version of fs.readdir with streaming api.
https://paulmillr.com
MIT License
378 stars 51 forks source link

simplify and fix race condition #140

Closed ronag closed 4 years ago

ronag commented 4 years ago

Tries to address:

It's easier to just look at _read than the diff.

broofa commented 4 years ago

Pulled this locally and ran the example I provided in #138 and it works. (Yay!) As does examples/list.js.

What happens if push() returns false before the _read() for-loop completes? The for-loop continues until all promises have been processed (i.e. pushed), regardless, which would seem to be in direct conflict with the expected behavior:

_read() should continue reading from the resource and pushing data until readable.push() returns false. Only when _read() is called again after it has stopped should it resume pushing additional data onto the queue.

Is this going to cause problems? (I'm not a streams guru, but my reading of the Readable source in node.js has me thinking it will buffer chunks, even if the buffer is above the highwater mark. It just won't call read() again until it's internal buffer of chunks is falls back below the highwater mark. Is that right?)

ronag commented 4 years ago

Is this going to cause problems?

No, it will just buffer the chunks. The return value is just a hint when it is false.

(I'm not a streams guru, but my reading of the Readable source in node.js has me thinking it will buffer chunks, even if the buffer is above the highwater mark. It just won't call read() again until it's internal buffer of chunks is falls back below the highwater mark. Is that right?)

Correct. Yes, Readable is very difficult to follow.

paulmillr commented 4 years ago

Why did you remove highwatermark: 1? This breaks ENOENT test consistently. Or, our test is written in an invalid way?

ronag commented 4 years ago

Why did you remove highwatermark: 1? This breaks ENOENT test consistently. Or, our test is written in an invalid way?

Performance reasons. It will stop buffering after a single element.

Which test breaks?

ronag commented 4 years ago

should emit warning for missing file?

ronag commented 4 years ago

Yes, not sure how I missed that. Probably some last minute change. Apologies. I'll sort it out ASAP.

ronag commented 4 years ago

I don't understand this test. What is it supposed to test?

It shouldn't stat something that doesn't exist? It seems to make some strange assumption that the directory will be listed, then paused, and then when the stream is resumed it will be statted. Which is a bit strange.

paulmillr commented 4 years ago

Consider a situation:

  1. readdirp() is initialized on some big root directory
  2. readdirp() receives path a/b/c to its queue
  3. readdirp is reading something else
  4. a/b gets deleted, so stat()-ting a/b/c would now emit enoent
  5. We should emit warnings for this case.
ronag commented 4 years ago

Ok. I undestand. I don't think the test is "correctly" written. We need x > highWaterMark number of entries and then delete an entry which is above that number in the sequence.

I'll sort it out.

ronag commented 4 years ago

Ah, I can see you already fixed it. Anyway, here is a PR. https://github.com/paulmillr/readdirp/pull/141