Why is `InternalError Text` insufficient?

jezen commented 6 years ago

The explanation is helpful, but I'm trying to understand the motivation. What information does the actual exception contain that the InternalError Text value does not?

pbrisbin commented 6 years ago

Thanks for the question, happy to explain!

First of all, it's worth noting that this is just the default usage suggested by the example. You can absolutely use an errorHandler-based approach if you want. It might look something like this:

newtype AppHandlerError = AppHandlerError Text
  deriving Show

instance Exception AppHandlerError

instance Yesod App where
  errorHandler (InternalError msg) = do
    -- ...
    liftIO
      $ notifyBugsnag settings
      $ toException $ AppHandlerError msg

And if someone felt like (testing and) PRing such an example, then I'd be happy to merge it.

What information does the actual exception contain that the InternalError Text value does not?

Ideally, for any actual production application that cares about security, it shouldn't contain the same information at all. You don't want to leak raw error messages to your users (I've seen errors like "Error: unable to decrypt that thing with key SUPERSECRET", I kid you not), so something should already be rescuing the actual exception, logging it, and replacing it with something opaque, like InternalError "Server error".

Unfortunately, most Haskell web applications (IME) don't consider this (it's arguably a bug how the current yesod scaffolds work in this regard), but I'd rather this library be geared towards the better behavior anyway. That said, even if we ignore that fact and assume the two values are the same information, just encoded differently:

I'm trying to understand the motivation

In short:

Having a value of type Exception e => e, unlocks things you almost certainly want:
- A distinct errorClass reported to Bugsnag to avoid everything being just a single error
- Ability to know the constructor, e.g. to know that ErrorCall may have HasCallstack information worth looking for
- Ability to use fields of the record (e.g. sqlState) to change how you report (e.g. for grouping)
- Ability to do all this for library-local exceptions, because you can use fromException to re-cover any in-scope types that this library may not know about (see #26)
To recover (1) from InternalError Text would require someone (you or I) parse the textual message, which is additional logic somewhere that's not strictly needed since this value was already an exception, and is handle-able at a different point in the stack anyway

Hopefully that clarifies things. If you can spot anyway to simplify but still retain all (or most) of the benefits, please do let me know!

jezen commented 6 years ago

First of all, it's worth noting that this is just the default usage suggested by the example. You can absolutely use an errorHandler-based approach if you want.

Taking an opinionated approach is fine. I'd argue this is what we're sorely lacking in the Yesod world — a large repository of best practices for more mundane work.

Ideally, for any actual production application that cares about security, it shouldn't contain the same information at all.

So if I understand you correctly, we're relying on every developer's personal discipline that they don't show a bare exception message in the error handler response, and the middleware approach is an extra safeguard to ensure the sensitive exception message never makes it that far. Is that right?

I tried printing a bare exception a couple of days ago, and IIRC I couldn't see anything in there that wasn't already in the exception message. Maybe it's because I only did something like error "foo" in a Hander Html, and I need to try it with something that has a hasCallstack.

pbrisbin commented 6 years ago

So if I understand you correctly, we're relying on every developer's personal discipline that they don't show a bare exception message in the error handler response

I wouldn't put it that way. This library "works" regardless, we're not relying on anyone doing anything. I'm only saying this:

A responsible developer should be careful not to leak sensitive information by way of user-facing 500 pages
In applications where that is the case, said developer would probably want to use the middleware-based approach in this library, because that's a convenient place to see the errors before they're sanitized for users

If said developer is:

Just not being so responsible (which can be a perfectly valid choice in some contexts), OR
They are sanitizing in such a way that it happens after errorHandler, AND
They don't care about all the other benefits I outlined...

Then they can 100% use the errorHandler-style approach for this library. I hope the library's API is simple enough generally that such usage is easy to figure out.

and the middleware approach is an extra safeguard to ensure the sensitive exception message never makes it that far

I wouldn't put that that way either. All I can say is this:

Using the middleware approach won't do anything to protect someone who doesn't also take steps not to leak information in 500 pages
Using an errorHandler approach may not work for someone who also took the steps not leak information in 500 pages, depending on how they've done it

I tried printing a bare exception a couple of days ago, and IIRC I couldn't see anything in there that wasn't already in the exception message

Let's try to get concrete. Here is an application that is being "irresponsible" in the way I'm describing:

-- All stock Yesod stuff, plus this:

getSomeRouteR :: Handler Html
getSomeRouteR = do
  settings <- asksYesod appSettings
  response <- callSomeApi apiKey settings

  -- rest of Handler logic

And let's assume the API calls are done using http-client. By default, it raises exceptions for HTTP errors. And those error messages are super rich and include the HTTP headers. Now all your user has to do is trigger an API error and they'll see this in your 500 page:

Internal Server Error

HttpException ...
  { requestHeaders:
     [ ...
     , ("Authorization", "Your production API key to some service they can now use")
     ]
   }

Oops.

So the way to solve this is by adding code somewhere in your stack that rescues and logs exceptions, and replaces them with an opaque message before they get to users.

pbrisbin commented 6 years ago

Heads-up: we can continue this discussion as much as you'd like, but I'm going to close the issue since it's not any kind of actionable problem IMO.

jezen commented 6 years ago

FYI : My questions aren't totally specific to this library; I'm sure this library works very well ☺️ I'm using Rollbar (which is roughly equivalent to Bugsnag) in a couple of my Yesod projects, and a lot of what we're discussing transfers directly because we're talking mostly about Yesod.

Thank you for sharing the concrete example with the potential API key leak. When I said I didn't see anything in the exception that wasn't in the exception message, I had in mind data that could be sent to Rollbar/Bugsnag which would aid in tracking down the source of the error.

Your example leads me to a follow-up question: If the scenario you described occurred with code like this:

  errorHandler err@(InternalError e) = do

…would e contain the rich error messages with the headers included?

Thanks also for taking the time to explain this so clearly. This will no doubt be invaluable knowledge for people in the future who are also looking for the right way to handle exceptions in Yesod.

pbrisbin commented 6 years ago

…would e contain the rich error messages with the headers included?

Yes, that's exactly the problem. You can see right where this happens too:

https://github.com/yesodweb/yesod/blob/ce0c6976593261e9895b055c46b4f2d655e9d0a5/yesod-core/Yesod/Core/Internal/Response.hs#L103-L104

pbrisbin commented 6 years ago

And you can see where that (perhaps secrets-containing) value is shown directly to users in defaultErrorHandler:

defaultErrorHandler (InternalError e) = do
    $logErrorS "yesod-core" e
    selectRep $ do
        provideRep $ defaultLayout $ defaultMessageWidget
            "Internal Server Error"
            [hamlet|<pre>#{e}|]
        provideRep $ return $ object ["message" .= ("Internal Server Error" :: Text), "error" .= e]

pbrisbin commented 6 years ago

I had in mind data that could be sent to Rollbar/Bugsnag which would aid in tracking down the source of the error

This is worth calling out. There's two points on the spectrum about how secure you want to be:

Users can't see secrets leaked through exceptions
3rd parties such as Bugsnag (or LogDNA) can't see secrets leaked in exceptions (or log messages)

I've been speaking only about (1), and taking steps to 100% ensure that won't happen (such as not rendering the raw e in InternalError e on the 500 page). That's sort of bare minimum IMO for any real application with real users.

I will typically attempt to abide by (2) in ways that aren't 100% robust. I will avoid logging sensitive data and I will try to sanitize things in a one-off, targeted way (such as stripping any Authorization header from HttpExceptions that I'm about to submit to Bugsnag), but I understand that I may mess up and leak some data to these 3rd parties. I'm personally comfortable with that, but to each their own.

pbrisbin / bugsnag-haskell

Why is `InternalError Text` insufficient? #27