pauljamescleary / scala-pet-store

An implementation of the java pet store using FP techniques in scala
Apache License 2.0
1.07k stars 211 forks source link

Error Handling #97

Closed nolledge closed 6 years ago

nolledge commented 6 years ago

Hi, thank you very much for this project. I had to look a while before finding a 'real life' example for a functional webservice and I am glad I have found one :) Currently I am asking myself where to put a more finegraded error handling. For example:

A repository algebra defines a function like this:

def get(orderId: Long): F[Option[Order]]

In this scenario the Option type just represents an order being present or not. But what about other errors that might occure? How would I handle network errors, constraint violations, parse errors and all of that stuff? Would you switch to an Either type? Is this part of the service layer and somehow has to be represented by the context F[_]? I am not sure where to handle those errors if, for example, I would like to return a BadGateway error instead of a generic InternalServerError.

pauljamescleary commented 6 years ago

Good question @nolledge, I think some of it is personal preference.

Broadly speaking, there are 2 kinds of errors. Errors you cannot do anything about and those you can.

For example, for simple apps you cannot do anything about a network error. Network is down, cannot function. In these cases errors tend to bubble all the way out.

For these "unpredictable" kinds of errors you cannot do anything about, they follow along in Monad in our case. As an example, cats effect IO has the ability to fail. We do not need to know specifics. If we need to raise or try and handle the error bubbling out, then the constraint we add is that F : MonadError

For errors that you can predict reliably (more like checked exceptions) and you want to handle, you typically will do that in an Either. In this sample application, I use EitherT for those kinds of errors.

If you look at an EitherT, we can have EitherT[IO, BusinessError, A] assuming BusinessError is some kind of known error type. The IO within it could fail unpredictably shortcircuiting your program. Also, Either could be a BusinessError also short circuiting.

I'll try to sum up my current thinking here and invite others to chime in (it is good discussion):

network errors For most use cases, cannot do anything about them. You can build in some kind of exponential backoff / retry mechanism, but you cannot retry forever, right? I use IO, and use MonadError to be able to raise or handle the error.

These tend to be at the edges of your system. Database calls, remote service calls, file system calls. Things that can just blow up because it was angry.

constraint violations We could be thinking many different things, but I consider "constraint violations" something that I know can fail. For these, I tend to use Either with a known error case (app specific).

For example, imagine that I have an Age type that represents a person's age and has a range of 0 - 200. I might model that something like...

import cats._, cats.data._, cats.implicits._

final case class private Age(value: Int) extends AnyVal
object Age {
  final case class InvalidPersonAge(value: Int)
  def apply(i: Int): Either[InvalidPersonAge, Age] =
    Either.cond(i > 0 && i < 200, new Age(i), InvalidPersonAge(i))
}

The good news is that now you cannot construct an invalid age (for the most part). The bad news is that you have to deal with that explicitly in the rest of your code.

parse errors It depends on what you mean. At times, I get a JSON packet in and want to run validations. I like to return all the errors, not just the first. For this case, I use Validated, or specifically ValidatedNel from cats as I want to aggregate all the errors. Then you can lift the result into an Either to fail and collect all the errors, or you can F.raiseError assuming F is a MonadError.

For all of these, I may use a type alias so I boil things down to a single type parameter F[_] only takes a single type param.

type ParseResult[A] = ValidatedNel[String, A]

Kind projector can help with some of that stuff.

Usually, you will have something like the aforementioned Age example. You can return Either, but you could also similarly return Validated instead. I tend to use Either, and allow the caller to do something like toValidated if need be, but that is personal choice.

Return a BadGateway Assuming you are using MonadError, you can adaptError. With ApplicativeError (MonadError extends this) you can do all kinds of fun things handleError, handleErrorWith, recover, attempt and more. In other words, you can inspect the error and do certain things.

Repository Errors In my "Repositories", they are only concerned with data access. I think of them as "HashMap". So something like def get(id: Long): F[Option[User]] is telling you:

Your application services / business logic then determines what to do with that info.

Sometimes, it is valid to get a None. Make sure user does not already exist.

The organization characteristics of the pet store follow loosely with hexagonal architecture / onion architecture / domain driven design. In there, you have:

Where you would have constraint validations, business rules, things like that would go in the domain in the models themselves, or other things like trait BusienssValidations. The important thing is that the things in the domain should be pure, free of things like IO or other effects. Just like the Age example above, it is pure, it does not know about a database, a service, HTTP, or anything else.

nolledge commented 6 years ago

Thank you for that excellent answer!

Just to sum it up for me: You keep the repositories return types free of errors that might occcure due to infrastructure issues etc and concentrate on the primitive logic of storing and requesting elements. Therfore basic datatypes like lists and options are used. Exceptions are allowed to propagate and can be handled when needed by a F[_] providing an instance of MonadError.

So assuming you have a Postgres implementation of a Repository. Would you catch those SqlExceptions and throw your own (to hide implementation details)? Until now I thought of throwing exceptions as something to avoid when trying to write more functional code. So I feel a litte intimidated by throwing them myself. But for my undertanding in the context of purity of a function it is ok to throw an exception (the function is then just not 'total' defined) but not to catch an exception and use it for flow control, because functions should just depend on their input parameters. Do you have the same understanding in this aspect?

pauljamescleary commented 6 years ago

No problem @nolledge.

"Would you catch those SqlExceptions and throw your own (to hide implementation details)?" I do not catch exceptions generally in the repository unless I can actually recover from them. For example, you can use REPLACE INTO in MySQL for a save method in order to do an update in the event the record already exists as opposed to attempting to catch a DuplicateKeyException and issue the update. If I could not recover, assuming F[_] : MonadError then you can F.raiseError with the original exception.

"So I feel a litte intimidated by throwing them myself. But for my undertanding in the context of purity of a function it is ok to throw an exception" I never throw ever ever ever ever ever ever ever ever (repetition for emphasis). That is what things like MonadError are for, to lift an exception into your effect type. I view throw as an abnormal termination of a program. Not sure if that is just my FP understanding or not, but throws are "hidden exits", like little land mines laying around your code waiting for something to blow up.

nolledge commented 6 years ago

Thank you for confirming that throwing an exception is actually a bad idea. So MonadError is used to abstract over the whole error handling mechanism and guarantees a F can 'handle' a certain error. But then again: a SQL client might signal with an exception that a table is not present and I want to return something different than an InternalServerError, maybe a bad gateway. Would I map those throwables in the repository implementation to my own throwables to hide implementation details? This seems to be the only choice when I want to avoid matching on specific throwables beyond the repository layer.

pauljamescleary commented 6 years ago

@nolledge a lot depends on your use case / application.

"a SQL client might signal with an exception that a table is not present" I have not encountered this. I use flyway to run migrations when the app starts up (the pet store does as well). Makes sure that the database schema is in a good place. If that fails, then the app doesn't start. If it succeeds, it means the tables (and indexes and stored procedures and users and everything) are present.

Here are the flyway migrations - https://github.com/pauljamescleary/scala-pet-store/tree/master/src/main/resources/db/migration

And here is where we run migrations (it is in Server, when the app starts up) - https://github.com/pauljamescleary/scala-pet-store/blob/c420b5947aead18416da7b6fa90d262b8b027037/src/main/scala/io/github/pauljamescleary/petstore/Server.scala#L30

If you have some kind of use case where you create tables on the fly in the app, then that is a little different, and handling it is somewhat subjective.

nolledge commented 6 years ago

Hi, thank you for your answer. Lets not focus too much on the SQL implementation. I was wondering where in general I would map an implementation specific error (like an SQL throwable) that might raise, to an error that my controller can pattern match to a HTTP status 503 or something. Because on controller level I do not want to match against implementation specific throwables. So to avoid implementation details beyond my repsoitory algebra I would somehow map those specific errors to some self defined errors.

The only location that seems appropriate for that is the implementation of the repository itself. Would you also use some kind of error mapping to hide implementation details?

pauljamescleary commented 6 years ago

@nolledge Assuming that your repository implementations can be pretty broad (say Riak, Postgres, MongoDB), then you really cannot anticipate all those up in your HTTP handling.

In that event, I would create an error hierarchy in my repository layer like

sealed trait RepositoryError
final case class UniqueConstraintViolation(message: String) extends RepositoryError

Then, you could at some point when you have your error do IO.raiseError or F.raiseError assuming you have a MonadError in scope.

Then, your error would bubble up to the Endpoint, and you can attempt to handle it, for example map it to your 503

nolledge commented 6 years ago

Thank you for your help!