poelstra / ts-stream

Type-safe object streams with seamless support for backpressure, ending, and error handling
MIT License
65 stars 13 forks source link

Transformer for handling ender errors directly downstream from source #53

Closed MattiasMartens closed 4 years ago

MattiasMartens commented 4 years ago

As discussed in #49

coveralls commented 4 years ago

Coverage Status

Coverage increased (+0.1%) to 96.671% when pulling 5e54caa058bf172387c5da1d3a5058b94fb7b0a2 on MattiasMartens:result into 8e68a4905202c25d36f9a24c1141f2e5e1776ba8 on poelstra:master.

MattiasMartens commented 4 years ago

Jenkins failed me for reducing test coverage by 0.02%, haha.

poelstra commented 4 years ago

Interesting use-case :)

In fact, it's good that you provide an actual use-case, instead of hypothetical one, because it allows to have a discussion about how you'd 'idiomatically' handle this.

For example, I wonder whether you'd really use endCatcher() for this type of scenario:

  1. In this example, it's more likely that the error would be thrown from the write that causes the number of guests to be exceeded. But ok, let's assume it's some other case where we can only decide the fate at the end of everything.
  2. More serious though: you left requestBiggerTable() as an 'exercise to the reader', but I wonder how this would really work in practice: the only way I would see this work, is to start a new stream pipeline, and start everything over from scratch. In that case, it won't be this pipeline itself doing such reconstruction, but some piece of code 'around' this pipeline. So it's most likely also that code that will handle the error, and that code is likely to look at result().

I have no issues with the code of endCatcher() itself, but I think the example scenario presented here is not really giving the right impression on how you'd typically use the library.

For example, a more typical case for endCatcher() that I envisioned, was something like this:

If you let it bubble up to the generic database reader source, it won't know how to handle this, and will thus just 'bounce it back'. To solve this, your application can put the following transform between the db reader and further processing transforms:

getRowsFromDatabase<MyObject>(myQuery)
    .transform(
        endCatcher((err) => {
            if (err instanceof NotAllObjectsProcessedError) {
                // ignore
            } else {
                // re-throw all other errors
                throw err;
            }
        })
    )
    .transform(/* ... other transforms etc ... */)
    .forEach(/* ... */);

So, in a way, the idea behind this transform (to me) is more to 'protect' any upstream elements from certain errors that they should not be 'worried' about.

I'm very interested in hearing your opinion about this though, because it seems like you want to use this in a certain application and you have an actual use for it beyond just this example.

MattiasMartens commented 4 years ago

because it seems like you want to use this in a certain application and you have an actual use for it beyond just this example.

Not really. I'm just interested in a clean, uniform interface. A number of times working with this library I've come up with something I thought was a hack but turned out to be idiomatic. I'd prefer if the standard patterns for handling these edge cases were documented in some way. My preference is to document them by including concrete functions in the library.

I have no issues with the code of endCatcher() itself, but I think the example scenario presented here is not really giving the right impression on how you'd typically use the library.

I agree with this. Structurally, my example isn't really that different from yours, except that the catcher has some "handling" to do before it can handle the error. (A more realistic example might be, repeating your case, let's say we can swallow the error but we do still want to record that it happened in some way.) But yes, most likely this would be handled by the routine that initiated the stream rather than as part of the stream context itself.

Basically, if it's possible to catch errors from an ender() in an intermediate step, I think there should be a way to do so at the source, too, in the interest of consistency.

poelstra commented 4 years ago

thought was a hack but turned out to be idiomatic.

Hmmm, custom handling of corner-cases can be hacky, but 'normal' stuff shouldn't feel hacky. So if you felt that way, things need to improve :)

I'd prefer if the standard patterns for handling these edge cases were documented in some way. My preference is to document them by including concrete functions in the library.

Agreed. I had many ideas to improve things, for example by providing ready-to-use transforms, but I stopped working at the company where we used this, and didn't see a lot of community interest (chicken-and-egg problem too, likely).

Structurally, my example isn't really that different from yours

I disagree: in your example, it's the source itself that wants to handle end()'s result, whereas in my example, the source itself doesn't care, and just uses default behavior (which, arguably will hold for most sources).

So in your example it is still pretty awkward to have to insert an extra transform, solely because you want to handle end()'s result, AND use that to further influence the result(). I would say that even with endCatcher(), your use-case is still solved in a 'hacky way'.

Now, my 'theorem' is that what you're trying to do in this example, won't actually occur in practice. For example like in this case, stuff coming back as an ender error is actually really an error, and it makes little sense to try to make the stream's result behave as if nothing happened. That is, that for almost every source, the default behavior is already going to suffice.

Basically, if it's possible to catch errors from an ender() in an intermediate step, I think there should be a way to do so at the source, too, in the interest of consistency.

Yeah, I do agree that it feels a bit 'asymmetric'. But if that is the case, we probably need a different mechanism altogether (i.e. endCatcher() is very useful, but for another use-case).

For example, we could allow end() to be called with a callback instead of the result promise, and call that callback with end()'s result (in case of error, or always), and have result() be resolved with that. I wonder whether that isn't starting to feel hackish as well.

Note that intuitively, you might think that something like this would be nice:

try {
  await stream.end();
  stream.setResult();
} catch (e) {
  stream.setResult(Promise.reject(e));
}

But I don't like that, because I would like result()'s fate to be decided at most once (and what better place than end()?) and also at least once (e.g. people shouldn't forget to call setResult(), and we can't know whether they just didn't call it because they thought the default was already good enough).

Perhaps an example documenting the defer-trick would be enough, like:

const result = defer();
try {
  await stream.end(optionalEndError, result.promise);
  result.resolve();
} catch (e) {
  // Do stuff with e, and either `result.resolve()`, or `result.reject(e)`, or `result.reject(someOtherError)`.
}

Downside is that defer() is not a standard thing (the implementation is trivial, of course, but still), but writing it down with a Promise itself isn't very readable.

Perhaps this discussion needs to continue in a separate issue, which is then specifically about handling end errors in a source. Then we can get this PR merged code-wise as-is, but with a more idiomatic example of when you'd use it.

MattiasMartens commented 4 years ago

I disagree: in your example, it's the source itself that wants to handle end()'s result, whereas in my example, the source itself doesn't care, and just uses default behavior (which, arguably will hold for most sources).

When I say "structurally", I mean in terms of the code's control flow. The only difference is that one implementation of endHandler() calls an outside function when it catches the error it's equipped for, and the other does nothing.

Regarding the use case that's being served, we have one case where we want to suppress an error. And another case where we want to respond to an error and then suppress it. Maybe one can infer that in my example the source "wants" to be responsible for the error, whereas in your example the source "wants" to be shielded from the error. But at the end of the day, the source is just an abstraction. All that matters really is what the developer wants.

To my mind the whole point of streams is that they're composable; and because they are composable, there is functionally no difference between a source that implements some behaviour and a source piped through a standard transform that implements that same behaviour. Which is why a more elaborate fix is not required.

To reconcile this with what I said above (maybe contradicting myself here):

Basically, if it's possible to catch errors from an ender() in an intermediate step, I think there should be a way to do so at the source, too

To me, endCatcher basically satisfies this requirement because it lets you catch/handle errors right before the source, which handles all the same use cases as doing so at the source. And it lets you do so in a way that is "on the beaten path", i.e., idiomatic and documented. That's good enough for me.

MattiasMartens commented 4 years ago

So in your example it is still pretty awkward to have to insert an extra transform, solely because you want to handle end()'s result, AND use that to further influence the result(). I would say that even with endCatcher(), your use-case is still solved in a 'hacky way'.

I mean, transforms are free, right? Or is it a performance concern?

MattiasMartens commented 4 years ago

Perhaps this discussion needs to continue in a separate issue, which is then specifically about handling end errors in a source. Then we can get this PR merged code-wise as-is, but with a more idiomatic example of when you'd use it.

Yes, my example is not idiomatic and not very well thought-out. As long as we agree there is a valid use-case for this, I'm happy to move forward using your example in the docs, and I don't think there's any need to break anything out for further discussion (for the reasons above).

MattiasMartens commented 4 years ago

I stopped working at the company where we used this,

This is a tangent but, curious if you still make use of this library yourself? Not in your current position perhaps, but in side projects and such?

poelstra commented 4 years ago

I mean, transforms are free, right? Or is it a performance concern?

Well, they are not completely free, because there's always the 'pumping' of that extra stream, of course.

But my concern here was not about performance. I think of a transform as really 'another element' in the pipeline, whereas what you are doing in the original example basically still belongs to that first element (being the source, in this case).

You're right about them being composable though (in fact, that's the whole point of the compose() function), so perhaps it's just a matter of taste.

I'm happy to move forward using your example in the docs, and I don't think there's any need to break anything out for further discussion

:+1:

This is a tangent but, curious if you still make use of this library yourself? Not in your current position perhaps, but in side projects and such?

I did in the past, yes, but not in any project I'm actively working on right now. Which is the reason why I didn't actively develop on it for a while.

poelstra commented 4 years ago

Thanks again @MattiasMartens ! Another welcome addition :1st_place_medal: