Closed phated closed 4 years ago
the browser Stream spec has error propagation IIRC
@contra yeah, I had been discussing the whatwg stream spec with chris - not sure where in the process it is and if anyone has a solid implementation yet.
@contra looks like there is a reference implementation polyfill at https://github.com/whatwg/streams/tree/master/reference-implementation
Not sure if it would help with the trouble you've been seeing -- but @piranna pointed to fstream
and that looks nice in that it integrates stat
.
AFAIK the Stream spec's error propogation comes from being based on Promises. Without getting into too much depth, Promise error propogation has... issues.
The initial WHATWG Stream spec was based on a lot of feedback from the Node.js Stream implementation and fixing what we got wrong. Since then it's taken a lot of turns and I'm not sure how many of them were to keep spec people and Promise advocates happy and how many were real improvements.
@mikeal the node stream error implementation is much worse than promises and the unhandledRejection
handler in new nodes would be a really nice thing for us to hook into. We have been dealing with a ton of issues that I'm trying to work through with @chrisdickinson and he said that WHATWG Streams actually solve some of them (sometimes-writeable/sometimes-transform streams).
The initial WHATWG Stream spec was based on a lot of feedback from the Node.js Stream implementation and fixing what we got wrong. Since then it's taken a lot of turns and I'm not sure how many of them were to keep spec people and Promise advocates happy and how many were real improvements.
I've been keeping track of the WHATWG stream changes — most of them are well-grounded. The biggest concern I have with them right now is that, last I checked, the byte reader stuff was up in the air, though that may have since changed.
AFAIK the Stream spec's error propogation comes from being based on Promises. Without getting into too much depth, Promise error propogation has... issues.
As someone who was formerly very anti Promise, I'd be very interested in having a conversation with you about this in a separate venue :)
So, unhandledRejection
solves the problem of errors going unhandled but the really hard part about propogation is what you do during splits. This happens in promises and in streams (when streams are piped to multiple outputs) and I've never seen a good solution to this. You either pick a direction to propogate, potentially not sending to the side that has a handler, or you potentially hit two handlers on each side of the split. We've been trying to figure this out since the first standardized Stream implementation landed in Node.js and haven't found a more reliable pattern than requiring people to handle errors in the stream that generates it.
As someone who was formerly very anti Promise, I'd be very interested in having a conversation with you about this in a separate venue :)
Sure :)
For the record, I don't really care about Promises. I don't hate them, and I'm not in love with them, I think they are fine for people that prefer them but I know from experience that many people find them a bigger conceptual barrier than callbacks.
However, I do dislike the proliferation of Promises through the specification process. I can't blame the spec authors though, there isn't a standardized way of doing something in the future or even providing callbacks and event handlers and it probably sucks to open up those parts of their spec to endless bike shedding. But this also means that Promises are showing up in all kinds of places they don't need to be, and I mean this not as a matter of preference but on a purely technical basis. It means that Promises are mostly complicating APIs that would otherwise be much simpler because there just isn't an alternative in the spec world while in the real world we have the Node.js standard callback API, EventEmitters, etc.
@mikeal: Thanks for your response — this is interesting! I'll respond to the Promise part of the conversation on :arrow_right: nodejs/readable-stream#181 so as not to fill up the vinyl-fs tracker.
I of course disagree with all of @mikeal's characterizations here. Promises are being used appropriately, and have no issues with error propagation. But I'm not really interested in debating that, just wanted to counter some of the FUD.
the byte reader stuff was up in the air, though that may have since changed.
It's getting really really nice. Zero-copy everywhere and much less awkward than a separate type. Watch https://github.com/whatwg/streams/pull/418 for the details.
@contra looks like there is a reference implementation polyfill at https://github.com/whatwg/streams/tree/master/reference-implementation
There's a couple issues I'd warn against.
@domenic this wouldn't end up in gulp 4 but is forward-looking towards gulp 5. The reference implementation might be fine to prototype against; however, do you know of any polyfill authors that are actively working on one? I remember seeing a bunch of people jump in like a year ago but haven't heard anything since.
@phated I haven't seen any in active maintenance, no. But, forking from the reference implementation and fixing the worst-performing aspects/de-Traceurizing would probably be a few hours work.
what about just using pump (or something similar) instead of .pipe as the default method of combining streams in the next version of gulp? It would have the advantage of keeping compatibility with current node streams in use.
The reason why I use gulp is because it is based around node streams. It is its greater feature for me, because it makes this compatible with the greater node ecosystem. I also use the vynil-fs in some other ways, and I have created awesome things with it.
I know and have experienced all the problems with Streams, including the difficulty in explaining them to newbies. I think we should fix the problems in node streams, because it will benefit the whole node community.
@phated what are the problems you are experiencing? how can we improve the gulp user experience by fixing streams?
@mcollina I believe the latest saga is documented through issues and PRs:
vfs.dest()
would cause the highWaterMark
to be hit and the stream would stop processing and this was caused by us having vfs.dest()
to sometimes behave as a Transform stream and sometimes as a Writeable stream - #120readable
event to stop being emitted - #142readable
event so you can't pass flowing streams to either of those libraries.readable
or data
events - #145We've also been the biggest proponent of better error management in node streams and were pushing for something like @chrisdickinson's original error management PR on node core. We don't want people to have to use gulp-plumber
in their pipelines and with the amount of people using it, I think error propagation should probably be the default behavior. We also need some sort of error recovery so the stream doesn't get torn down on error.
If you are considering looking at a different stream api, I strongly recommend you have a look at pull-stream
pull-streams can do all the things that node-streams can do, but are dramatically less code. you don't even need a library to implement a pull-stream (it's just a couple of functions). The source and the sink both have a chance to apply back pressure, and errors propagate. If a source errors, that is pretty much like a normal end condition, but if a sink errors that propagates back up and aborts the source.
When you need to interact with the regular node ecosystem, there is stream-to-pull-stream and pull-stream-to-stream modules for interop.
back pressure in pull-streams is much easier to reason about, so you can use use FRP style things like pull.take(5)
which reads 5 things from the source and then aborts it, but since it also have does have fully fledged back pressure you can use it for IO too.
@dominictarr I actually played with the idea of adding pull-streams to gulp recently. They are amazing and so simple, but they are a bit too simple :) The issue is that they are not easily identifiable. Gulp checks what is returned from the task, and behaves accordingly, but pull streams are basically just plain functions. Source is a function with 2 arguments, sink is a function with 1 argument, and through can't be checked at all since it appears as a sink. That dives below the line I considered acceptable when duck typing.
Wouldn't you consider adding a flag to the spec? Something like stream._isPullStream = true
? And maybe another one with the type of the stream? Or maybe combine them into one truthy flag stream._isPullStream = 'sink'
, although that feels weird.
Anyway, my point is, pull streams are a type of async interface that just needs to be reliably identifiable, otherwise tools like gulp can't really use them.
thank you! hmm, I actually had that in early version of pull-stream, (back when there was a .pipe()
function, but that is removed now).
Since pull-streams are just a function, and you don't have to inherit from anything, or depend on any specific module to create a pull-stream, it would mean that every pull-stream module would need to a flag. That is not a reasonable change.
On the other hand, if gulp adopted a convention for accepting pull-streams which gulp modules could follow, which could be returning a pull-stream with a flag for gulp to see, that would be gulp's business.
If you do that it shouldn't start with _
because that conventionally means "private" but or "internal" but I'd argue it's a public api.
Pull-streams are too good for committee people. No classes, composable API, code is unnecessary readable... boring. Just a historical record after reading this crap.
With the awesome work on https://github.com/mafintosh/streamx - we will be switching to that project for our streams implementation. It fixes a lot of our issues and aims to be compatible with node core streams. I'm going to open a new issue for the switch.
After all the trouble we have been dealing with using node streams (error propagation, lost events, etc), I think it might be time to consider using different stream module as the underlying streams for this module. Quite some time ago, https://github.com/caolan/highland was big news, maybe we should review it and see how it handles some of our problems.