gleam-lang / gleam

⭐️ A friendly language for building type-safe, scalable systems!
https://gleam.run
Apache License 2.0

Unified error type #890

Closed lpil closed 2 years ago

lpil commented 3 years ago

This issue will be updated as time goes on to reflect the current state of this work

Currently Gleam provides a Result type that represents success or failure, but no generic error type. Gleam users have to decide on a case-by-case basis what error type to use with Result.

In applications where a high amount of control and information is desired in error types (for example, in a compiler) custom types can be used effectively.

In most programs this level of detail is not required, and convenience and ergonomics are preferred instead. This has resulted in many Gleam programs using String as the error type. This is highly convenient, but it carries only the most basic detail about the error, so it is not a desirable pattern to become convention.

If we design a convenient and generically useful error type and supply it in the standard library it can serve as a good default error type for the majority of Gleam programs.

Other languages

Here is a look at what other statically typed languages that do not use exceptions use for error types.

Elm

Elm has no unified error type. Instead, Elm programmers define custom types or use strings, as Gleam programmers do today.

Elm almost entirely runs in the browser within a very tightly controlled and pure sandbox, so error handling may be less needed in Elm than most other languages.

OCaml

https://ocaml.org/learn/tutorials/error_handling.html#Result-type

The OCaml website recommends 4 options for error types:

Polymorphic variants solve the problems of ad-hoc definitions as they don't need to be declared, and they are composable, as two polymorphic variant sets can be merged into a union of the two sets of possible variants. However, they do not offer a convenient way to add additional context or wrap the error, limiting the diagnostic information.

https://keleshev.com/composable-error-handling-in-ocaml

Rust

TODO

Haskell

TODO

Go

https://blog.golang.org/go1.13-errors

TODO

Features

Definition

For many applications it would be convenient to be able to define ad-hoc errors rather than having to declare an error type for each type.

pub fn head(list) {
  case list {
    [x, ..] -> Ok(x)
    _ -> result.error("Cannot get first element of empty list")
  }
}

Wrapping with context

One of the main drivers here is to be able to build a backtrace-like structure of application-meaningful information about the error, enabling errors to be printed like so:

error: Unable to save changes to profile

cause:
    1. Profile service call failed
    2. Could not authenticate with profile service
    3. Got unexpected HTTP status 403

Or alternatively it could be written in this fashion:

error: Got unexpected HTTP status 403

context:
    1. Calling profile service
    2. Authenticating with profile service
    3. Performing HTTP request

These layers of contextual information can be added using a function similar to result.map_error that knows specifically how to add a layer to an error value.

try user = db.run(sql, [user_id])
  |> result.context("Could not load user")

This function would replace the existing map_error calls used to wrap error types in higher level error types, though it requires no custom type definition and could be omitted if the programmer doesn't desire this context, reducing boilerplate.
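A minimal sketch of how such a context function and the second print format could fit together. It is written in current Gleam syntax, which differs from the `try` syntax used elsewhere in this thread, and the names `Problem`, `new`, `context` and `to_string` are hypothetical rather than part of the standard library.

```gleam
import gleam/int
import gleam/result

/// Hypothetical generic error: the original message plus the context
/// layers added on the way up, most recently added layer first.
pub type Problem {
  Problem(message: String, context: List(String))
}

/// Create an ad-hoc failed result from a message.
pub fn new(message: String) -> Result(a, Problem) {
  Error(Problem(message: message, context: []))
}

/// Add one layer of context to a failed result. Ok values pass through.
pub fn context(outcome: Result(a, Problem), layer: String) -> Result(a, Problem) {
  result.map_error(outcome, fn(problem: Problem) {
    Problem(message: problem.message, context: [layer, ..problem.context])
  })
}

/// Render the error message followed by a numbered list of context layers.
pub fn to_string(problem: Problem) -> String {
  let header = "error: " <> problem.message
  case problem.context {
    [] -> header
    layers -> header <> "\n\ncontext:\n" <> number_lines(layers, 1)
  }
}

// Number each layer starting from `index`, one per line, without a
// trailing newline.
fn number_lines(layers: List(String), index: Int) -> String {
  case layers {
    [] -> ""
    [layer] -> "    " <> int.to_string(index) <> ". " <> layer
    [layer, ..rest] ->
      "    "
      <> int.to_string(index)
      <> ". "
      <> layer
      <> "\n"
      <> number_lines(rest, index + 1)
  }
}
```

Because each call prepends its layer, the outermost context ends up first in the list, which matches the order of the "context:" listing above.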

Identifying specific errors

One thing that is very easy to do with today's custom error types is pattern matching:

case result {
  Ok(x) -> ...
  Error(CouldNotAuthenticate) -> ...
  Error(NetworkError(detail)) -> ...
  Error(PermissionDenied) -> ...
}

With a generic error type this is no longer possible. Instead, a predicate function would need to be used.

the_error_type.includes(error, "Could not authenticate with profile service")

This relies on a magic string and as such is brittle. If the error detail changes then the string no longer matches, and the type system cannot catch this problem. This is the approach commonly used in Go.

This is difficult to solve because, unlike Rust, Gleam doesn't have interfaces, so we don't have a way to perform subtyping or downcasting to specific error types.

We may want to encourage libraries to expose APIs that use custom error types, and for them to provide a function that converts them into a generic error as desired.
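As a rough illustration of that idea, a library could keep its custom error type for callers that want to pattern match and ship a conversion function alongside it. This sketch uses current Gleam syntax; `Problem`, `HttpError`, `fetch_profile` and `http_error_to_problem` are all hypothetical names, and `fetch_profile` is only a stand-in for a real HTTP call.

```gleam
import gleam/int
import gleam/result

/// Hypothetical generic error type, as it might be supplied by the
/// standard library.
pub type Problem {
  Problem(message: String, context: List(String))
}

/// The library's own error type, which callers can still pattern match on.
pub type HttpError {
  CouldNotAuthenticate
  NetworkError(detail: String)
  UnexpectedStatus(code: Int)
}

/// Conversion function the library provides for callers that only want
/// the generic error.
pub fn http_error_to_problem(error: HttpError) -> Problem {
  let message = case error {
    CouldNotAuthenticate -> "Could not authenticate with profile service"
    NetworkError(detail) -> "Network error: " <> detail
    UnexpectedStatus(code) ->
      "Got unexpected HTTP status " <> int.to_string(code)
  }
  Problem(message: message, context: [])
}

/// Stand-in for a real HTTP call, so the example is self-contained.
pub fn fetch_profile(token: String) -> Result(String, HttpError) {
  case token {
    "valid" -> Ok("profile data")
    "" -> Error(CouldNotAuthenticate)
    _ -> Error(UnexpectedStatus(403))
  }
}

/// Application code that does not need to branch on the error kind
/// converts at the boundary.
pub fn load_profile(token: String) -> Result(String, Problem) {
  fetch_profile(token)
  |> result.map_error(http_error_to_problem)
}
```

Code that needs flow control matches on `HttpError` before converting, so no magic strings are involved; only the final, report-oriented value is generic.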

Reporting

We would provide a function to print errors using the format above, and also would provide accessors to the data so that programmers can render the data in formats of their design.

Adding metadata

When defining a custom type to use as an error, the programmer can add as many fields as desired containing additional information.

As Gleam lacks interfaces it is difficult to do this with a generic error type: each level in the chain must be the same type, and we are unable to downcast to specific error types as Go and Rust can. In practice this is not done frequently in these languages, but it is very useful when needed.

Naming

The error type

The generic error type to be used in Result(value, TheErrorType)

Alias for Result(value, TheErrorType)

So that users do not have to type this so frequently.

Notes

@yaahc's recent talk provides lots of great context and information on error handling. The talk focuses on Rust so she talks about APIs we cannot make use of in Gleam due to our lack of interfaces, but otherwise it is an excellent resource. https://www.youtube.com/watch?v=rAF8mLI0naQ

CrowdHailer commented 3 years ago

Does this require language support?

Asking because it's on the language repo. My thought was that this could probably be built entirely in the standard library.

Is there a way to implement this in a progressive way?

Is there a way to avoid implementing this as a big bang release? Can smaller parts be got out the door first? What is the most important piece to start with?

My opinion on this would be that adding a source to the error type would be most useful. i.e. being able to see

0: Could not load user
1: Failed to connect to database

This could actually be done, simply by having the error type as a list of strings. not a long term solution but maybe a useful convention to start with.

try rows = db.run(sql, ...)
  |> result.wrap_error("Could not load user")

where

pub fn wrap_error(result, message) {
  // Prepend the new message so the most recent context comes first
  map_error(result, fn(previous) { [message, ..previous] })
}
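The list-of-strings convention can be sketched end to end as follows. This uses current Gleam syntax rather than the `try` syntax above; `connect` is a hypothetical stand-in for the database call, and `render` is an assumed helper that prints the numbered form shown earlier.

```gleam
import gleam/int
import gleam/result

/// Prepend a message to the list of error messages. Ok values pass through.
pub fn wrap_error(
  outcome: Result(a, List(String)),
  message: String,
) -> Result(a, List(String)) {
  result.map_error(outcome, fn(previous) { [message, ..previous] })
}

// Hypothetical stand-in for a database call that always fails.
fn connect() -> Result(Int, List(String)) {
  Error(["Failed to connect to database"])
}

pub fn load_user() -> Result(Int, List(String)) {
  connect()
  |> wrap_error("Could not load user")
}

/// Render the messages as "0: ...", "1: ...", one per line.
pub fn render(messages: List(String)) -> String {
  render_from(messages, 0)
}

fn render_from(messages: List(String), index: Int) -> String {
  case messages {
    [] -> ""
    [message] -> int.to_string(index) <> ": " <> message
    [message, ..rest] ->
      int.to_string(index)
      <> ": "
      <> message
      <> "\n"
      <> render_from(rest, index + 1)
  }
}
```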

Who are the consumers of this error? Developer/Software user

Is the goal of this project to make debugging Gleam projects easier, i.e. to surface information in the error at places like assert Ok(value) = do_some_work()? Or should this error also be expected to be castable into 4xx/5xx server responses for clients?

CrowdHailer commented 3 years ago

Also how important is performance?

At the moment list.head returns {error, nil}. Does changing nil to something containing source information affect performance more than we want?

lpil commented 3 years ago

Does this require language support?

No I don't think it does. I'm posting here for visibility as this is possibly going to be very impactful if we implement it.

Is there a way to implement this in a progressive way?

I think I would like to use an opaque type for this so we can carefully control the API exposed. This will have the drawback of not being able to use the errors in constants, which is something I would like to resolve in some fashion.

This could actually be done, simply by having the error type as a list of strings. not a long term solution but maybe a useful convention to start with.

This is somewhat what I am thinking to start.

Is the goal of this project to make debugging Gleam projects easier, i.e. to surface information in the error at places like assert Ok(value) = do_some_work()? Or should this error also be expected to be castable into 4xx/5xx server responses for clients?

The former for sure, though I am also interested in the latter. The matter of introspection and flow control based upon different kinds of errors is a big question.

Also how important is performance?

I shouldn't think constructing a record holding a list will be much of a performance impact, it's a very fast operation. It would be similar to Elixir's exception construction.

CrowdHailer commented 3 years ago

Proposal Level 1,

We've been using the phrase "Level 1" at work to remind ourselves that level 2 probably exists, so the question is not "is this as good as it can be?" but "is it better than what we have right now?". I suspect there are several more levels to this.

type Report = String
type NewResult(a, e) = Result(a, tuple(e, List(Report)))

fn wrap_error(result, top_error new_error, to_report mapper) {
  case result {
    Ok(value) -> Ok(value)
    Error(tuple(current_error, sources)) ->
      Error(tuple(new_error, [mapper(current_error), ..sources]))
  }
}
  1. Report is the standard type for error information. It is what you get after a specific error has been cast to something that can be printed or sent to a logger. As there are no traits, we need this shared target type. For now I have defined it as a String, but potentially it could be richer, although the majority of use cases for the report are debug information, so String is a pretty good start.
  2. The error half of a result is now tuple(e, List(Report)). This is essentially a non-empty list. (A failed result with an empty list of errors would be meaningless.) The first element is a type specific to the function that returned the result, i.e. not yet transformed to a report, to make it easy for consumers to act upon it. The list of reports is the error's sources. These have been transformed to the general report type because acting on them is no longer necessary; they are only for debug information.
  3. Wrapping an error transforms the current error to its standard Report/String representation and adds it to the list of sources. Doing this requires a mapping function that turns the current error into its report.
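For reference, the same Level 1 shape can be written with type annotations in current Gleam syntax, where tuples are spelled `#(a, b)` instead of `tuple(a, b)`. This is only a sketch: the `top_error` and `to_report` labels are taken from the usage further down, and `QueryError`, `AppError` and `example` are included purely to show one call.

```gleam
pub type Report =
  String

/// A result whose error half is the current, still-matchable error plus
/// the reports of everything that caused it.
pub type NewResult(a, e) =
  Result(a, #(e, List(Report)))

/// Replace the current error with `new_error`, pushing the current error's
/// report onto the list of sources. Ok values pass through unchanged.
pub fn wrap_error(
  outcome: NewResult(a, inner),
  top_error new_error: outer,
  to_report mapper: fn(inner) -> Report,
) -> NewResult(a, outer) {
  case outcome {
    Ok(value) -> Ok(value)
    Error(#(current_error, sources)) ->
      Error(#(new_error, [mapper(current_error), ..sources]))
  }
}

pub type QueryError {
  ConnectionError
}

pub type AppError {
  DatabaseError
}

/// One wrapping step: a library error becomes an application error and
/// its report is kept as a source.
pub fn example() -> NewResult(Int, AppError) {
  Error(#(ConnectionError, []))
  |> wrap_error(top_error: DatabaseError, to_report: fn(_) {
    "Failed to connect to database"
  })
}
```

The annotations make the design choice visible: the error type changes from `inner` to `outer` at each wrap, while the list of sources stays a plain `List(Report)`.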

Some usage.

type QueryError {
  ConnectionError
  ConstraintError(constraint_name: String)
}

// also sql syntax error, table/field not found.
pub fn query_error_to_report(query_error) {
  case query_error {
    ConnectionError -> "Failed to connect to database"
    // Gleam has no string interpolation, so build the message explicitly
    ConstraintError(name) -> string.concat(["Constraint ", name, " errored"])
  }
}

// This function matches on the library specific error type to implement retry logic
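fn run_sql_with_retries(sql, retries) {
  case pgo.query(sql), retries {
    // No retries left: return whatever the last attempt produced
    result, 0 -> result
    // Connection failed and retries remain: try again with one fewer
    Error(tuple(ConnectionError, _sources)), _ ->
      run_sql_with_retries(sql, retries - 1)
    result, _ -> result
  }
}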

type AppError {
  DatabaseError
  ClientError
  ConfigError
}

fn fetch_users() {
  let sql = "SELECT * FROM users"
  // After calling the function with our retry logic, this function wraps the error as one of our
  // application errors, and the database information becomes part of the list of sources/reports.
  try rows =
    run_sql_with_retries(sql, 5)
    |> wrap_error(
      top_error: DatabaseError,
      to_report: pgo.query_error_to_report,
    )

  try users =
    list.try_map(rows, row_to_user)
    |> wrap_error(
      top_error: DatabaseError,
      to_report: pgo.query_error_to_report,
    )

  Ok(users)
}

// Both the try calls here are using the App error, with all the error report information after it.
pub fn run() {
  try _secret =
    os.get_env("SECRET")
    |> wrap_error(
      top_error: ConfigError,
      to_report: os.env_error_to_report,
    )
  try _users = fetch_users()
  Ok("We finished")
}

Performance: I don't know how we feel about performance. I chose the above structure to make an error with no information as small as possible, e.g. the error return value from list.head could be Error(tuple(Nil, [])). However, because there is no "to_error_report" trait or similar mechanism, when wrapping an error you have to provide the function that turns the current error into its report version.

A simpler API might be to have the sources include the current error already formatted. e.g.

list.head([])
// => Error(tuple(Nil, ["Empty list doesn't have head"]))

This would mean the error string would always be produced. However, if we consider the error case to be mostly edge cases, that might not be a problem.

Final thought. Span traces

Error sources give very similar information to span traces. If span tracing existed you might not need to worry about error sources.

If that is the case I think that working out an ecosystem wide way to handle span information should come first. My reason for this is span information and context information would be just as useful for logging as it would be for errors.

CrowdHailer commented 3 years ago

Interesting talk on errors in Go.

https://www.youtube.com/watch?v=IKoSsJFdRtI

It talks about stacktraces.

Instead they have a wrap function that they selectively use to add metadata, which reminds me of the span traces in the Rust talk shared earlier.

RFC 7807 Problem Details for HTTP APIs

https://tools.ietf.org/html/rfc7807

lpil commented 3 years ago

Good finds, thank you @CrowdHailer !