Middleware fails to propagate own exceptions in async mode

ring-clojure / ring

Clojure HTTP server abstraction

MIT License

3.77k stars 520 forks source link

Middleware fails to propagate own exceptions in async mode #403

Closed mszajna closed 7 months ago

mszajna commented 4 years ago

The contract of Ring async handlers is to eventually either call respond or raise. The Concepts page doesn't go into that much detail but I infer there is no semantics attached to a handler throwing an exception given jetty adapter ignores them.

ring.middleware.* follows the pattern of separating request and response processing of:

(defn wrap-xxx [handler options]
 ([request respond raise]
  (handler (xxx-request request options)
           (fn [response] (respond (xxx-response response options))) raise)))

If xxx-request or xxx-response was to throw, because of a bug, an unprocessable request/response value, as a result of faulty options or executing code supplied in the options, the exception bubbles instead of being raise-d.

Example:

((ring.middleware.cookies/wrap-cookies
  (fn [_ respond  _] (respond {:cookies {:a 1}}))
  {:encoder (fn [_] (throw (ex-info "fail" {})))})
 {} println println)
=> Execution error (ExceptionInfo) fail
; expected
=> nil
Execution error (ExceptionInfo) fail

All middleware following this pattern that cannot guarantee not to throw is affected. Frankly, rarely any useful code can make such guarantees. It would be easy to dismiss the problem on the grounds of breaking the contract, saying encoder is not allowed to throw for example. In practice, real world code throws in unexpected ways - what suffers here is resilience.

The compromised pattern seems to have been picked up by third party libraries.

It's actually rather tricky to comply with the contract. It requires careful scoping of try-catch blocks (made less cumbersome with try-let):

(defn wrap-xxx [handler options]
 ([request respond raise]
  (let [[request' ex] (try [(xxx-request request options)]
                        (catch Throwable t [nil t]))]
    (if ex
      (raise ex)
      (handler request'
               (fn [response]
                 (let [[response' ex] (try [(xxx-response response options)]
                                        (catch Throwable t [nil t]))]
                   (if ex
                     (raise ex)
                     (respond response'))))
               raise)))))

Apart from this being a genuine bug, I wanted to raise the issue as a demonstration of how hard it is to comply with the 3-arity contract even in the simple case, and what's even worse, decide if any given implementation adheres to it or not. The issue makes question whether the contract's complexity had been fully considered, and the burden of verifying compliance acknowledged before adoption. If the assumption was that users won't interact with callbacks directly, the ecosystem of faulty libraries begs to differ.

With Ring 2.0 updating the contract anyway, you are in position to reconsider this decision. I understand it's a very difficult decision to make, with the Clojure community's continued failure to arrive at a single, widely adopted standard. Still, I believe avoiding commitment to any particular solution proved to introduce more problems than it mitigates.

Lastly, for Ring 1.x middleware authors, and perhaps for the benefit of this library, you could introduce a middleware factory taking a synchronous -request and -response function and maybe some exception processor, similar to the interceptor pattern. I'm happy to propose an implementation if you're interested.

weavejester commented 4 years ago

Raising an exception is fine, so long as it is caught within the calling thread. Consider:

(defn handler [request respond raise]
  (future
   (try (respond (make-response request))
        (catch Exception ex (raise ex)))))

If we apply middleware to this, any exception thrown within the future will be converted to a raise. It's generally simpler to wrap the new thread than every piece of middleware.

It may be an idea to automate this pattern. Either:

(defn respond-with-raise [respond raise response]
  (try (respond response) (catch Exception ex (raise ex))))

Or:

(defmacro with-raise [raise & body]
  `(let [raise# ~raise] (try ~@body (catch Exception ex# (raise# ex#)))))

Now, I understand that this is one more thing that the end developer needs to consider when working directly with Ring, but I don't see a good alternative. In Java, if you open up a new thread, either explicitly or implicitly, you have to have some way of handling exceptions in that thread. Even if middleware automatically handled this case, the end developer would still need to handle it for their code.

What alternative design for asynchronous handlers did you have in mind?

mszajna commented 4 years ago

Experience developing await-cps suggests it's not that simple.

The code that executes as a part of respond logically happens after the handler has 'returned'. Having exceptions bubble back to the handler not only could have unexpected interaction with client try/catch code, it could cause double execution of raise. Consider:

(defn handler [request respond raise]
  (future
    (try (respond {:status 200})
      (catch Exception ex (raise ex)))))

(defn wrap-xxx [handler]
  (fn [request respond raise]
    ; first middleware chooses to raise on respond
    (handler request (fn [_] (raise (ex-info "xxx" {}))) raise)))

(defn wrap-yyy [handler]
  (fn [request respond raise]
    ; second middleware's raise handler fails
    (handler request respond (fn [ex] (println "yyy") (throw (ex-info "oops" {}))))))

(-> handler
    wrap-xxx
    wrap-yyy
    (apply {:request-method :get} println println nil))
=> #object[...]
yyy
yyy

While it is possible to redefine the 3-arity contract so that handlers are allowed to throw, outside responding and raising, it only solves the problem on the way down to the handler. Middleware still needs to handle the exceptions in callbacks. Also, note that this would too be a breaking change.

I don't think the problem is related to Java threads. The issue is not that the unhandled exceptions silently get swallowed (this is annoying but there are workarounds), but that they don't correctly propagate via raise. I do agree it is just as much of a burden for end developers.

What alternative design for asynchronous handlers did you have in mind?

Let me answer that in a separate comment.

weavejester commented 4 years ago

Having exceptions bubble back to the handler not only could have unexpected interaction with client try/catch code, it could cause double execution of raise.

If the error handler isn't robust, yes, but robustness is a criteria of a good error handler. An error handler should handle any reasonable exceptions that arise from its operation.

I also don't anticipate middleware to need to override raise often. The only use-case I can think of is an error reporting layer, which would necessarily be at the outer layer. Can you think of another reasonable use-case?

While it is possible to redefine the 3-arity contract so that handlers are allowed to throw, outside responding and raising, it only solves the problem on the way down to the handler.

There's no contract that says that 3-arity handlers cannot throw exceptions; it's just not a good idea to do so if you switch threads because those exceptions won't be caught.

mszajna commented 4 years ago

Your reply sounds a little bit dismissive. I'm under the impression that, going with continuations, Ring is trying to avoid choosing an abstraction for asynchronous processing, but I believe that, unintentionally, without understanding the implications, it created one. With the pattern of embedding 1- and 3-arity handlers in a single function it is implied that the two have equivalent semantics. In synchronous Ring exceptions propagate by virtue of the call stack. The callback model of 3-arity subverts that, replacing the natural call stack with continuations, where the call stack is effectively inverted. It's a reasonable expectation that the semantics of error propagation of a synchronous handler translate reliably to the asynchronous counterpart. If Ring is not precise about the way to achieve it, correct software is impossible.

Having exceptions bubble back to the handler not only could have unexpected interaction with client try/catch code, it could cause double execution of raise.

If the error handler isn't robust, yes, but robustness is a criteria of a good error handler. An error handler should handle any reasonable exceptions that arise from its operation.

While I agree that that error handler shouldn't ever throw I think this is not about robustness but correctness. The duplicated error handler execution is just one example. I don't believe it's possible to define a continuation based contract that's consistent with semantics of a regular call stack, where continuations are allowed to throw. If such consistency is not a goal it really ought to be documented in the big print as it has limitations not only for those that choose to invoke callbacks manually but equally for those that use adapters to abstractions that do pursue that consistency.

I also don't anticipate middleware to need to override raise often. The only use-case I can think of is an error reporting layer, which would necessarily be at the outer layer. Can you think of another reasonable use-case?

As far as 'reasonable' use-cases go even rendering of an error page sometimes blows up. But again, I don't believe this is about reasonable use-cases but the general case. Error handlers do work of arbitrary complexity. Middleware is not just for libraries - end users abstract certain parts of their software with it too. Allowing one faulty middleware to have a knock-on effect on the correctness of the whole stack is the opposite of resilience.

To be clear, in the model with catch-all per thread that you proposed, respond handlers that throw are prone to problems just as much. Even though I'm happy to provide examples, bear in mind that a contract that is hard to reason about is a minefield of similar problems.

(defn wrap-finally [handler]
  (fn [request respond raise]
    (handler request
      (fn [response] (println "finally") (respond response))
      (fn [ex] (println "finally") (raise ex)))))

(defn wrap-xxx [handler]
  (fn [request respond raise]
    (handler request (fn [_] (throw (ex-info "xxx" {}))) raise)))

(-> handler ; the same as before
    wrap-finally
    wrap-xxx
    (apply {} println println nil))
=> #error{}
finally
finally

There's no contract that says that 3-arity handlers cannot throw exceptions; [...]

As stated at the top of this issue, I inferred that through the behaviour of jetty adapter. Perhaps it's a bug and throwing is permitted. Either way, Ring needs to be clear which is the case as both carry extra responsibilities - catch-all in all middleware or catch-all in every new thread. I used the word 'redefine' as a decision to go with either could break existing clients.

mszajna commented 4 years ago

What alternative design for asynchronous handlers did you have in mind?

This perhaps deserves a separate issue but I'm going to try to be brief.

Other platforms foster correctness using a construct of a potentially failed value wrapper, that forces the use of monad-like 'bind' operation to unwrap the value, and re-wrap any new failures that this might generate. A handler would either return that wrapper, or throw trying. Note that the latter bears all the composability guarantees thanks to 'bind'.

For Clojure, available options I can think of are manifold Deferred, promesa IPromise and Java 8 CompletableFuture. core.async is not such option as it lacks error semantics leading to similar problems as the one being raised here. A third party dependency in the contract is undesirable making manifold and promesa a hard sell. I think that leaves the platform built-in CompletableFuture.

Java 8 has been a requirement for long enough that I think it is a safe choice in that regard. The API of CompletableFuture or the interface of CompletionStage is admittedly unnecessarily convoluted. There are popular Clojure libraries that wrap it, like promesa. Even manifold works with CFs, although it requires manual conversion back. There are projects aiming to implement async/await on top of CF.

hato is a Java 11 http client that speaks CF natively.

A separate problem is the distinction between synchronous and asynchronous middleware. 3-arity solves the problem neatly but with CF that's not an option. This is pure speculation but I'm thinking allowing the handler to return a CF or regular response would be an option if Ring shipped with a bind operation in the utils package, wrapping the intricacies of CF API in the process. This would be backward compatible with synchronous Ring.

(defn response-bind [cf-or-map f] ; f, just like a handler, returns cf-or-map
  (if (instance? CompletionStage cf-or-map)
    (.thenCompose ^CompletionStage cf-or-map
      (reify Function
        (apply [_ response]
          (let [cf-or-map' (f response)]
            (if (instance? CompletionStage r')
              cf-or-map'
              (CompletableFuture/completedFuture cf-or-map'))))))
    (f cf-or-map)))

(defn wrap-xxx [handler]
  (fn [request] (-> request xxx-request handler (response-bind xxx-response))))

Disclaimer: not tested. Catching of errors not covered, although they propagate correctly as it stands.

weavejester commented 4 years ago

Your reply sounds a little bit dismissive.

That wasn't the intent, but you do need to make a very good case to change existing behaviour, and I need to play the part of the defense and question everything.

With the pattern of embedding 1- and 3-arity handlers in a single function it is implied that the two have equivalent semantics.

I don't believe that's implied. When I see callbacks, I don't assume that they will act the exact way as a synchronous function would, particularly when comparing error handers to try-catch blocks.

Even though I'm happy to provide examples, bear in mind that a contract that is hard to reason about is a minefield of similar problems.

Examples are necessary to ground any discussion. If you say something is hard to reason about, that's just opinion unless you back it up with use-cases that demonstrate unintuitive or complex reasoning.

Regarding your wrap-finally example, I can see how that behaviour differs from the equivalent synchronous code, and while the behaviour is predictable when you look at the code, if a developer switched from a synchronous to an asynchronous version, they may be surprised by how the behaviour changes.

That said, is this a use-case that will be encountered regularly? Again, the only realistic use-case I can think of for overriding raise is to provide an error handler. I suppose you could also use a finally to close a resource at the end of an asynchronoous handler, but I haven't seen an example of this in practice,

There's no contract that says that 3-arity handlers cannot throw exceptions; [...]

As stated at the top of this issue, I inferred that through the behaviour of jetty adapter.

The Jetty adapter displays a 500 error response in either case, whether through raise or by throwing an exception within the calling thread. I'm confused why you think the Jetty adapter ignores exceptions in the 3-arity case.

weavejester commented 4 years ago

I think that leaves the platform built-in CompletableFuture.

My initial impression is that CF does not look particularly straightforward, but I'll reserve judgement until I see a more complete implementation of what would be needed for a CF-based handler/middleware architecture.

mszajna commented 4 years ago

The Jetty adapter displays a 500 error response in either case, whether through raise or by throwing an exception within the calling thread. I'm confused why you think the Jetty adapter ignores exceptions in the 3-arity case.

My bad. I was curl-ing without -v and (with ring 1.7.0) was getting a blank page in case of bubbled exceptions, while raise-d exceptions yield some HTML. I'm happy to assume that 3-arity handlers are allowed to throw. This is an important consideration for middleware authors if they need to switch threads so I believe this still ought to be documented.

Again, the only realistic use-case I can think of for overriding raise is to provide an error handler.

Yes, raise callback is the error handler. We seem to disagree as to whether it is reasonable for Ring to make any assumptions about what sort of logic can and cannot go in there.

Here's another example, showing that respond can get duplicated too:

(defn wrap-catch [handler]
  (fn [q r e] (handler q r (fn [ex] (r {:status 500})))))
(defn wrap-println [handler]
  (fn [q r e] (handler q (fn [response] (println "here") (r response)) e)))
(defn wrap-fail [handler]
  (fn [q r e] (handler q (fn [response] (throw (ex-info "fail" {}))) e)))
(-> handler ; as before
    wrap-catch
    wrap-println
    wrap-fail
    (apply {} println println nil))
=> #object[...]
here
here

mszajna commented 4 years ago

My initial impression is that CF does not look particularly straightforward

Conceptually CF is equivalent to JS Promise that a lot of developers are familiar with. The API is incredibly bloated because Java's proliferation of functional interfaces, but I believe everything else can be expressed by a combination of handle and thenCompose. I'm not sure to what extent developers are aware of this and how much responsibility for education would be on Ring. Tools like promesa abstract all of that away but middleware library authors may not want to introduce that dependency. Ultimately, the API still forces you to do the right thing (bar obtrude* and get, not part of CompletionStage interface).

but I'll reserve judgement until I see a more complete implementation of what would be needed for a CF-based handler/middleware architecture.

In terms of ring-core, the response-bind and the updated middleware pattern above seem to be all there is. Otherwise, ring-devel does some error handling that requires exception-bind of some sort, and ring-servlet needs to anticipate CF as well. Would a PR be appropriate at this stage?

weavejester commented 4 years ago

Yes, raise callback is the error handler. We seem to disagree as to whether it is reasonable for Ring to make any assumptions about what sort of logic can and cannot go in there.

To clarify my position, I don't believe that asynchronous handlers need to act identically to synchronous ones in all circumstances. If there are conditions where the behaviour diverges, that's not an automatic failure, especially if those conditions are unlikely to be experienced in normal usage.

That's not to say that a waterproof abstraction is something we don't want to aim for, but that needs to be weighed with other concerns. A leaky but simple abstraction may be preferred over a watertight but complex one.

Would a PR be appropriate at this stage?

So long as it's clear that any PR is going to take some time to be merged, if indeed it is merged at all. However, at least initially I'd be most interested in examples of usage. How does it compare with the current system for ease of use?

Also be aware that Ring needs to continue supporting the existing behaviour, so any alternative asynchronous system needs to be in addition to the existing 3-arity behaviour, even if the latter is ultimately deprecated. This may have implications for performance and complexity.

Ultimately we need to demonstrate that the existing system has enough problems in real world usage to justify any additional system. This is possible, but it's not going to be an easy sell.

mszajna commented 4 years ago

A leaky but simple abstraction may be preferred over a watertight but complex one.

The way I understand it, leakiness is what makes abstractions complex. When you're forced to consider what other components might do, the composability suffers. Simplicity is the lightweight state of confidence in non-interaction.

Ultimately we need to demonstrate that the existing system has enough problems in real world usage to justify any additional system. This is possible, but it's not going to be an easy sell.

Makes perfect sense. I understand the proposal around CompletableFuture is a major change, not to be taken lightly. I've raised this issue in an attempt to demonstrate some of the problems with the status quo. I want to stress that CF proposal is a separate issue from the exception propagation issue in the existing model, even if the former is inspired by the latter. There are possible fixes for 3-arity, even if applying them makes it less attractive. I'll raise a separate GH issue for CF.

Back to 3-arity, would you agree that the catch/println/fail example above is quite real-world? It shows clearly that, with inverted stack, error handlers carry just as much complexity as the successful callbacks. Side effects in middleware are not unheard of.

weavejester commented 4 years ago

The way I understand it, leakiness is what makes abstractions complex.

In this case, the leakiness stems from the expectation that synchronous middleware will act the same as asynchronous middleware, but as you point out, there are edge cases where this is not true.

However, synchronous middleware doesn't interact with asynchronous middleware. There's no additional complexity by supporting both types, as there's no interaction between calls. If Ring only supported the 3-airty form, there'd be no leakiness, because there'd be no expectation that the callbacks provided would act the same as a try-catch block.

Back to 3-arity, would you agree that the catch/println/fail example above is quite real-world?

No, in fact quite the opposite. Why would you put an error reporting handler at the bottom of your stack? It should be the outermost middleware.

In general I can think of two scenarios you'd want to override raise. The first is to report errors, which would be the outermost middleware as mentioned. The second scenario is where you'd want to clean up a resource after an exception; this is less common due to pooling, but a conceivable scenario. Can you think of any others?

Also, do problems only occur when there are additional side-effects in repond or raise?

mszajna commented 4 years ago

You seem to be defending the pattern with handlers catching and raising callback exceptions pretty strongly. Does that mean you consider it to always have been a part of the contract, and not a new responsibility being pushed on the handler? Or is it that you deem that responsibility to be so much easier to satisfy than the expectation for middleware callbacks not to throw, that you're open to accept the edge cases?

If Ring only supported the 3-airty form, there'd be no leakiness, because there'd be no expectation that the callbacks provided would act the same as a try-catch block.

The difference is subtle and unintuitive. Your statement would be fair if Ring was upfront about these quirks and limitations. I wonder how much aware had the Ring community, including yourself, been before this issue have been raised? How many developers would be as willing to sacrifice composability, had they been informed?

In general I can think of two scenarios you'd want to override raise. The first is to report errors, which would be the outermost middleware as mentioned. The second scenario is where you'd want to clean up a resource after an exception; this is less common due to pooling, but a conceivable scenario. Can you think of any others?

I think it's important to distinguish errors from exceptions. Unhandled exceptions indeed are errors - there's not much you can do about them but report and it makes sense to handle them at the outermost place. But then, there are all the other exceptions that developers throw, anticipate and handle as appropriate. There are few assumptions we can make as to where these originate in and what is the right way to address them. That depends on business requirements, architecture and developer's imagination, limited by any contracts.

Now, middleware is a pattern for extracting a piece of logic from a handler for reuse or code structuring, an abstraction. Developers use that pattern to implement all sorts of behaviour. Sometimes that behaviour involves the handling of exceptions. For instance, buddy.auth uses exceptions to communicate declined authorization. In the current form it only handles the ones raised in the same thread. It's not hard to imagine the intent to handle the ones raised asynchronously however, either for completeness of the throw-unauthorized facility, or to implement a 'backend' that does IO in a non-blocking way (I'm thinking OAuth2 access token resolution).

The point is, I don't think it's possible or useful to guess what users do with middleware. The ones you see published in libraries are just the tip of the iceberg that happen to be reusable. Unless the contract states clearly that middleware is not to turn a raise into a respond, except for the outermost one, it's only fair to assume that someone is doing it.

Certainly, 1-arity Ring does not ask what developers do with it. That's what frameworks do. I, for one, was attracted to Ring precisely for this reason.

weavejester commented 4 years ago

You seem to be defending the pattern with handlers catching and raising callback exceptions pretty strongly. Does that mean you consider it to always have been a part of the contract, and not a new responsibility being pushed on the handler?

There's no other way it could work. If you have code that may execute outside the calling thread and is intended to return a response, you have to catch any unhandled exceptions regardless. If you don't, exceptions thrown by that code will cause the handler to time out:

(defn handler [request respond raise]
  (future (/ 1 0) (respond {:status 200}))

Your statement would be fair if Ring was upfront about these quirks and limitations. I wonder how much aware had the Ring community, including yourself, been before this issue have been raised?

I'm not sure I understand. A callback obviously won't work the same way as throwing an exception. I don't believe that anyone who seriously looked over the original proposal was under any illusions that there wouldn't be differences.

Unhandled exceptions indeed are errors - there's not much you can do about them but report and it makes sense to handle them at the outermost place. But then, there are all the other exceptions that developers throw, anticipate and handle as appropriate. For instance, buddy.auth uses exceptions to communicate declined authorization.

Let's delve a little more into that. When is it appropriate to catch an exception?

The buddy.auth middleware uses exception for control flow, but I'm cautious about encouraging this, as it's generally considered to be an anti-pattern. Outside of the internals of complex macros like core.match, I haven't seen this pattern commonly used in Clojure, and I'm inclined to go with the common wisdom in this case.

Another possible scenario is some form of fault tolerance middleware. We could imagine some middleware that watches for specific exceptions and retries the handler on error. For example, with again:

(defn wrap-again [handler options]
  (fn [request]
    (again/with-retries options
      (handler request))))

Is this a feasible pattern? I haven't seen this in the wild, and it doesn't seem a particularly good idea to retry the entire handler, rather than a narrower portion.

A third possibility is cleaning up resources on error. If there's some middleware that opens a connection or some other resource, and needs to clean up that resource after completion:

(defn wrap-resource [handler options]
  (fn [request]
    (let [resource (open-resource options)]
      (try (handler request)
           (finally (close-resource resource))))))

This seems the most plausible, though I admit I haven't actually seen this pattern in practice. Database connections are generally pooled, or else opened and closed within the handler.

Can you think of any other scenarios? I'd like to get an idea of realistic use-cases where asynchronous handlers may potentially behave in an unintuitive fashion, and I believe that implies some form of overridden raise.

The point is, I don't think it's possible or useful to guess what users do with middleware. The ones you see published in libraries are just the tip of the iceberg that happen to be reusable.

If we cannot guess what users do with middleware, then we have to rely on user reports. But there have been no reports to my recollection, aside from yours, of any unintuitive behaviour relating to the 3-arity form of handlers. And there have been no reports of problems that have occurred in real-world production systems.

Now it's possible that few people are using Ring's asynchronous handlers compared to other solutions, like Pedestal's interceptors, or manifold in Aleph. I have no way of knowing to what extent a particular feature of Ring is used. I would expect that after three years from release, if there were significant problems in practice that I'd get at least one issue or email, but that expectation may be based upon faulty assumptions.

However, we cannot replace an existing system without some evidence that there problems in real-world use-cases. If there are no user reports, then due dilligence would suggest that we at least try to come up with plausible scenarios where there could be a problem.

mszajna commented 4 years ago

You seem to be defending the pattern with handlers catching and raising callback exceptions pretty strongly. Does that mean you consider it to always have been a part of the contract, and not a new responsibility being pushed on the handler?

There's no other way it could work.

The question is specifically about exceptions bubbling from the callbacks. While the responsibility to raise exceptions coming from the code you control is understood, the responsibility to catch and raise exceptions coming from respond or raise is not that clear. Most certainly there is another way it could work. The contract could compel middleware to make sure the callbacks do not throw at all, raise-ing any exceptions that might arise. As described earlier in this thread, this approach is free of the edge cases above.

I was a bit confused when I first realised the implications of a throwing callback, but couldn't find any write-ups on it, so I went on to assume they're not supposed to do that. I guess your replies suggest I was wrong. But still I wonder, how many developers might have assumed the same thing? How many developers didn't give it enough thought and failed to wrap respond in a try-catch, producing a latent bug?

I wasn't able to find many async Ring examples online, but here is one from a popular library where callback exceptions are ignored. Incidentally, your recent response ignores that subject too, although I understand it's just a snippet on reddit. Any chance you can point me to a code base/example that gets it right? I couldn't find anything else.

The buddy.auth middleware uses exception for control flow, but I'm cautious about encouraging this, as it's generally considered to be an anti-pattern.

So is callback hell, yet here we are. Exceptions are a language construct. Given you support them kind of, it would be nice to support them consistently, or loudly deny it. No one considers 1-arity Ring to be encouraging this pattern either, but the fact it works is a reasonable expectation. As this article illustrates, exceptions are great if they're used for exceptional cases. buddy.auth is written the way it is and I'm under the impression it's one of the most (if not the most) popular auth libraries for Ring. If you were looking for practical examples, here it is, even if not currently exhibiting the problem.

Database connections are generally pooled.

Pooled connections need a reliable release all the same.

Can you think of any other scenarios?

clj-http, probably the most popular Clojure HTTP library uses exceptions as control flow by default too. A contrived, yet practical example could be a proxy server.

(defn wrap-unavailable [handler service-name]
  (fn [q r _]
    (handler q r
      (fn [ex] (r {:status 500 :body (str service-name " is unavailable")})))))

(defn proxy-handler [match proxy-to]
  (fn [{:keys [uri]} r e]
    (if (clojure.string/starts-with? uri match)
      (clj-http.client/get (str proxy-to uri)
        (fn [response] (try (r response) (catch Throwable t (e t))))
        e)
      (r nil))))

(defn app-handler
  (compojure.core/routes
    (wrap-unavailable (proxy-handler "route-a/" "http://service.a/") "Service A")
    (wrap-unavailable (proxy-handler "route-b/" "http://service.b/") "Service B")))

Again, this example illustrates that developers may want to turn a raise into respond at any level in the stack, causing all above problems.

There have been no reports of problems that have occurred in real-world production systems.

Judging by the volume of online articles, the asynchronous variant is not too popular. Together with the fact that the standard middleware doesn't throw unless you fiddle with it, this doesn't surprise me. I would argue this doesn't make it any less of an issue, even if of a lower severity. Not unlikely the stability of some of those systems relies on respond non-throwing regardless of it not being guaranteed by the contract.

However, we cannot replace an existing system without some evidence that there problems in real-world use-cases.

Excuse me if I'm taking my time separating the CF proposal. As far as the scope of this report goes, the system doesn't need replacing, just patching. If middleware-injected logic handled its exceptions the problem would be alleviated. The hard part is making developers aware of this requirement, having them adjust existing libraries and not failing at it. The CF proposal is partly inspired by the inability to enforce this requirement (if it's a requirement at all)

weavejester commented 4 years ago

The question is specifically about exceptions bubbling from the callbacks. While the responsibility to raise exceptions coming from the code you control is understood, the responsibility to catch and raise exceptions coming from respond or raise is not that clear.

Ah, by so "callback exceptions" you meant specificially the respond and raise callbacks only, and you're proposing handling these via a separate mechanism to exceptions in callbacks created by the handler.

I can appreciate the reasoning behind that. In a synchronous model the respond and raise are outside the scope of the inner handler, and it's not unreasonable for the asynchronous model to follow the same logic.

But should this pattern be mandated? And what happens if only raise cannot throw an exception?

The buddy.auth middleware uses exception for control flow, but I'm cautious about encouraging this, as it's generally considered to be an anti-pattern.

So is callback hell, yet here we are.

You'll need to justify callbacks being generally considered an anti-pattern.

Pooled connections need a reliable release all the same.

Yes, but the acquire and release for pooled connections is typically brief; if you're using a pool, you tend to release the moment you're done with the connection. I wouldn't expect to see middleware releasing a pooled resource.

clj-http, probably the most popular Clojure HTTP library uses exceptions as control flow by default too.

Can you give an example? I know that clj-http throws exceptions by default on error responses, but that's not control flow unless those exceptions are expected.

Again, this example illustrates that developers may want to turn a raise into respond at any level in the stack, causing all above problems.

Breaking out exception handling of the proxy handler into its own middleware seems an odd design choice, but it's possibly the closest we've come yet to a realistic use-case.

As far as the scope of this report goes, the system doesn't need replacing, just patching. If middleware-injected logic handled its exceptions the problem would be alleviated. The hard part is making developers aware of this requirement, having them adjust existing libraries and not failing at it.

One possible course of action could be to add functions to make it easier to wrap exceptions in callbacks. I mentioned this earlier, but something applied specifically to middleware:

(defn catch-raise [raise callback]
  #(try (callback %) (catch Throwable t (raise t)))))

(defn wrap-example [handler]
  (fn [request respond raise]
    (handler request (catch-raise raise (fn [resp] (assoc resp ::foo 1))) raise)))

weavejester commented 4 years ago

It might be a good time for me now to take my "devil's advocate" hat off, as I think we've covered the main arguments of this discussion, and we're sinking into details that I don't think will affect the outcome. I'd like to summarize both the problem and my current position.

The problem occurs under the following circumstances:

Asynchronous handlers are used.
An exception is thrown by respond or raise outside the calling thread.
The raise callback is overridden by middleware that is not the outermost middleware.

This can result in an inner raise handling an exception from an outer respond or raise.

This seems a fairly rare problem in practice, and my current feeling is that this does not require replacing the existing asynchronous handler design in its entirity (e.g. with CF or similar).

However, I do think it justifies some helper functions, and for a stricter specification, as you convincingly argued for. So I'd like to propose the following:

A ring.util.async/raising function to automate raising thrown exceptions.
An update to the Ring 2.0 specification that mandates that the respond and raise callbacks cannot throw exceptions (they must be raised).
Changes to the corresponding middleware in Ring for both Ring 1 and Ring 2 handler types.

The raising function would look like:

(defn raising [raise f]
  (fn [& args] (try (apply f args) (catch Throwable t (raise t)))))

For example, middleware that changes the respond callback might now look like:

(defn wrap-content-type [handler options]
  (fn [request respond raise]
    (handler request
             (async/raising raise
               (fn [response]
                 (respond (content-type-response response request options)))
             raise)

Do you see anything I've missed?

mszajna commented 4 years ago

The problem occurs under the following circumstances:

Asynchronous handlers are used.

An exception is thrown by respond or raise outside the calling thread.

The raise callback is overridden by middleware that is not the outermost middleware.

This can result in an inner raise handling an exception from an outer respond or raise.

The symptoms observed depend on the error-handling strategy. So far we've discussed a handler that catches both, own exceptions and those that bubble from respond, and then raises. There's many strategies a developer could follow. Some ideas:

(defn simple-handler [q respond raise]
  ; Don't bother handling errors if you can prove your code doesn't throw.
  (future (respond i-never-throw)))

(defn precise-handler [q respond raise]
  ; Make sure to handle any failures on the handler side,
  ; then invoke callback with no safety nets.
  ; Roughly equivalent to what compojure.api does with its Sendables
  (future (if-let [result (try (risky-business) (catch Throwable t (raise t) nil))]
            (respond result)))

(defn catch-all-handler [q respond raise]
  ; Catch all unhandled exceptions and raise, as discussed so far
  (future
    (try (respond (risky-business))
      (catch Throwable t (raise t)))))

(defn handling-handler [q respond raise]
  ; Actually handle an exception
  (future
    (try (respond (risky-business))
      (catch ExceptionInfo e (respond (error-response e))))))

In principle, any code path that might invoke callbacks multiple times or fail to invoke any, is going to exhibit faulty behaviour in some way. Allowing respond and raise to throw makes reasoning about these code paths much harder.

handling-handler shows respond duplication is possible even without error-handling middleware. You could blame the user for poor scoping of try block but, in fact, precise scoping in Clojure is tricky.

In my opinion, the problem is rare in practice because middleware rarely throws. Most async handlers I've seen would be affected in some way if it did. I would argue Ring ought to either mandate callbacks non-throwing to make reasoning about handler code paths easy, or instruct users precisely how to handle errors/exceptions right. The former appeals to me much more and, the way I see it, is just formalising what's already a common practice.

Your proposal looks good. It's exactly what I was hoping to achieve. Thanks!

My current feeling is that this does not require replacing the existing asynchronous handler design in its entirity (e.g. with CF or similar).

The remaining shortcoming is the inability to enforce the rules imposed. It's arguable if it's worth introducing an alternative contract but I'm willing to take my shot in a separate GH issue.