Raynos / read-stream

Base class for readable streams
MIT License
5 stars 1 forks source link

Peer review. #4

Closed Raynos closed 10 years ago

Raynos commented 11 years ago

I've tried to use stream.Readable correctly to implement a bit of sugar to reduce complexity.

I've outlined how you would wrap push / pull sources in the README, and I'm not sure whether it's correct.

I also wrote a bunch of tests that in more detail describe how I think object streams should work.

The biggest difference actual API difference that's not sugar is being able to stream.push(err) which does stream.emit("error", err) and stream.push(one, two, three, ...) to push in a batch, for the rest it's just thin sugar.

I also made some patches to readable-stream for this code to work. The biggest one is having push always return false if length > highWaterMark. The other one is having onread(null, undefined) act the same as onread(null, "") does for string based streams, i.e. it means "we have no data right now but this is not EOF"

/cc @isaacs @TooTallNate @dominictarr @substack

dominictarr commented 11 years ago

why not just stream.push(one).push(two).push(three) I think push(error) is bad, push is for buffering, but you don't want to buffer the error.

can you link to something describing onread(null, '')? if you go onread(null, null) does that mean EOF? couldn't you just use empty string for that in object streams too?

isaacs commented 11 years ago

why not just stream.push(one).push(two).push(three)

Because push() returns the backpressure flag. False = "please stop pushing for now", and true = "you can push more, there's room".

I think push(error) is bad, push is for buffering, but you don't want to buffer the error.

Yes, this is correct. push() is for data only. Errors should be emitted right now.

can you link to something describing onread(null, '')?

In CryptoStreams, we need a way to signal that the read is completely finished, but no data is available right now. It's a rare edge case, but yay for us, we get to interact with OpenSSL, where this is required.

if you go onread(null, null) does that mean EOF?

Yes, null is always EOF.

couldn't you just use empty string for that in object streams too?

Right, but what if your object stream had empty strings that you wanted to treat as empty strings?

Raynos commented 11 years ago

@dominictarr

can you link to something describing onread(null, '')? if you go onread(null, null) does that mean EOF? couldn't you just use empty string for that in object streams too?

onread(null, "") in buffer mode doesn't add anything to the internal buffer. It's a way for _read to say "I tried to read some data, but the underlying source returned nothing". If you do onread(null, "") in object mode it will literally append the object "" to the buffer.

I think push(error) is bad, push is for buffering, but you don't want to buffer the error.

I like the sugar of push(error) but that's a read-stream feature not a streams2 feature.

why not just stream.push(one).push(two).push(three)

If you had a pipeline of streams a.pipe(b).pipe(c).pipe(d) my interpretation is that pushing chunks one at a time moves the individual chunks one at a time through the pipeline. I think there are good use cases for moving a batch of chunks down the pipeline in a batch but I don't have benchmarks for the advantages of doing that.

Being able to do push(multiple, things, ...) also allows you to return multiple things from a single _read call.

isaacs commented 11 years ago

JavaScript people really need to get more comfortable with just having more than one line to do things.

Returning more than one thing from a single _read call is a bad idea.

Raynos commented 11 years ago

@isaacs can you explain why? I feel like returning batches is a more optimal way to move data out of a pull source.

dominictarr commented 11 years ago

I think raynos has a point here, because otherwise you just endup buffering twice, however that is what push handles for you.

If you can't return more than one thing from a _read call, you will just have to buffer that until the next _read call. then you have buffering in your buffering. maybe it would be more consistent to cb null, [obj] and null, [thing1, thing2] and if you want to read an array null, [[...]]

push(error) is not sugar. it's decidedly sour. It looks like you are creating a stream of errors, and then I can't figure out why it didn't work, and then I have to read the code and then I see an if(data instanceof Error) and then I'm like fukkk u !!! I had to click and read that thing and hurt my brain and you could have just said .emit('error', err) and so I never use your modules again.

Raynos commented 11 years ago

@dominictarr your right it's not idiomatic. I'll remove it.

isaacs commented 11 years ago

If you can't return more than one thing from a _read call, you will just have to buffer that until the next _read call.

this.push(thing1);
this.push(thing2);
cb(null, thing3);

It's not super pretty, but it's better than having to parse out variable args, which is apparently one of the slowest things that you can possibly do in JS.

Raynos commented 11 years ago

@isaacs this.push(thing1) inside _read will call _read synchronously AGAIN if length < hwm

Raynos commented 11 years ago

@isaacs which means that if that second _read call returns data synchronously there will be a thing4 between thing1 and thing2

Raynos commented 11 years ago

@dominictarr removed push(err) in v2.1.1