golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.12k stars 17.68k forks source link

proposal: errors: add Errors function to undo Join #57358

Closed lestrrat closed 1 year ago

lestrrat commented 1 year ago

I propose to add errors.Errors (note: I don't like the name either, please think of it as a placeholder) function to handle errors that contain multiple errors.

Problem

Upon trying out go1.20rc1's new features around errors.Join, I could not immediately find out how one should access individual errors in sequence:

err := interestingFunc()
for err := range err.Errors() { // obviously does not compile: Errors does not exist
   // do something with err
}

This sort of sequential scan is necessary if the contained error does not implement a distinct type or its own Is(error) bool function, as you cannot extract out plain errors created by errors.New/fmt.Errorf using As(). You also would need to know what kind of errors exists before hand in order to use As, which is not always the case.

It can be argued that users are free to use something like interface{ Unwrap() []error } to convert the type of the error first, and then proceed to extract the container errors:

if me, ok := err.(interface{ Unwrap() []error }); ok {
  for _, err := range me.Unwrap() {
     ...
  }
}

This works, but if the error returned could be either a plain error or an error containing other errors, you would have to write two code paths to handle the error:

switch err := err.(type) {
case interface{ Unwrap() []error }:
   ... // act upon each errors that are contained
default:
   ... // act upon the error itself
}

This forces the user to write much boilerplate code. But if we add a simple function to abstract out the above type conversion, the user experience can be greatly enhanced.

Proposed solution

Implement a thin wrapper function around error types:

package errors

// Errors can be used to against an `error` that contains one or more errors in it (created
// using either `errors.Join` or `fmt.Errorf` with multiple uses of "%w").
//
// If the error implement `Unwrap() []error`, the result of calling `Unwrap` on `err` is returned.
// Only the immediate child errors are returned: this function will not traverse into the child
// errors to find out all of the errors contained in the error.
//
// If the error specified as the argument is a plain error that does not contain other errors,
// the error itself is used as the sole element within the return slice.
func Errors(err error) []error {
   switch err := err.(type) {
   case interface { Unwrap() []error }:
     return err.Unwrap()
   default:
     return []error{err}
   }
}

Pros/Cons

Pros:

Cons:

ianlancetaylor commented 1 year ago

CC @neild

seankhliao commented 1 year ago

The original proposal #53435 included Split, which was explicitly dropped. This proposal doesn't appear to bring with it much new information?

lestrrat commented 1 year ago

@seankhliao I suppose in terms of code, etc, this does not add much.

If anything please take this as "I actually needed it in the field, and interface { Unwrap() []error } was very unintuitive (especially in the presence of a more prominent errors.Unwrap)"

neild commented 1 year ago

Can you give a concrete example of where you needed this function?

Split was dropped from #53435 because we couldn't come up with sufficient examples for when it might be needed. Specific examples of when it would be useful would be informative.

lestrrat commented 1 year ago

@neild Thanks, here's some example. Hope it makes some sense.


The case I was handling was collecting data from multiple goroutines

graph LR
  A(User) --> B[Driver]
  B --> C1[Worker1]
  B --> C2[Worker2]
  B --> C3[Worker3]
  C3 --> | report error | B
  C2 --> | report error | B
  C1 --> | report error | B
  B--> |report error| A

In the old version we just did not bother to write User code that analyzes the different failure modes, because only one would be able to be sent back to the user anyways. But now that there was errors.Join, we would be able to change our code to report errors in finer detail to the user, such that the User can react differently to different errors.

It was very easy to create an error that contains all of the worker errors using it. Now it was time to write the example code to show how Users could react to the different errors. This is where we stumbled upon our initial problem. How does the User know which errors occurred? All errors were created using fmt.Errorf at this point.

Once we read a bit of the Go source code (because there wasn't any real examples in the user-visible docs for this sort of thing) and learned that the stdlib actually differentiates between a single error/unwrappable with single error/unwrappable with multiple errors, the first instinct was to look for a named interface like these, just like in io.Closer or io.ReadWriter:

// These were the things we looked for, I believe they don't exist in Go ATM
package errors

type Unwrapper interface {
  Unwrap() error
}

type MultiUnwrapper interface {
  Unwrap() []error
}

So that we could do:

if err := driverCode(); err != nil {
  switch err := err.(type) {
  case errors.MultiUnwrapper:
    for _, childErr := range err.Unwrap() {
      if childErr.String() == .... { // because errors haven't been made distinguishable yet
        ...
      }
    }
  default:
    // some other error
  }
}

But obviously this didn't exist, so now we were back to

if err := driverCode(); err != nil {
  switch err := err.(type) {
  case interface{ Unwrap() []error }:
      if childErr.String() == .... { // because errors haven't been made distinguishable yet... but at least it works as an escape hatch
        ...
      }
  default:
    // some other error
  }
}

The main issue we had at this point was not so much if it was possible to do it or not, but as providers of the Driver/Worker module, we thought that it would discourage users from using this module code once they see the cumbersome boilerplate code just to handle errors.

So then the next step in our search was to find a possibly more idiomatic way to achieve the same thing without resorting to type conversions. We found that errors.Is and errors.As worked with those errors that contain more errors, and tried writing code that way to see if it would result in less boilerplate. Upon some rudimentary tries, we found that the error types and/or the presence of a Is(error) bool method was important to extract errors that you care about. For example, this does not work:

func interesting() error {
  return fmt.Join(
    fmt.Errorf(`interesting error 1`),
    fmt.Errorf(`interesting error 2`),
  }
}

// Yes, I know this does not work :)
if err := interesting(); err != nil {
  var interestingErr error
  if errors.As(err, &interestingErr) {
     // handle specific case
  }
} 

Which means that we would have to rewrite the majority of our error cases with something equivalent to

type InterestingError struct {
   ...
}
func (err InterestingError) Error() string {
  ...
}

if err := interesting(); err != nil {
 j var interestingErr mypkg.InterestingError
  if errors.As(err, &interestingErr) {
     // handle specific case
  }
} 

And, even with all of this, we still end up forcing the end user to write boilerplate like this.

if err := driverCode(); err != nil {
  // there could possibly more error types to handle than one
  var interestingErr1 mypkg.InterestingErr1
  var interestingErr2 mypkg.InterestingErr2

  if errors.As(err, &interestingErr1) {
    // handle specific error
  } else if errors.As(err, &interestingErr2) {
    // handle specific error
  } else {
    // some other error
  }
}

We understand that boilerplate code is required nonetheless, but we didn't find that the presence of distinguishable types made the boilerplate any simpler.

Once we found this out, it just seemed more trouble than it's worth to convert our code to use errors.Join Instead what we dreamt was be able to reduce the boilerplate to something like below, which seemed a bit more concise and easy to understand

if err := driverCode(); err != nil {
  for _, child := range errors.Errors(err) {
     switch child := child.(type) {
     case *mypkg.InterestingErr1:
       // handle specific error
     case *mypkg.InterestingErr2:
       // handle specific error
     default:
       // some other error
     }
   }
}

While this is still boilerplate, one major advantage that we find is that we would be able to point users to the documentation of errors.Errors() in an explicit manner, instead of having to explain how Unwrap() error and Unwrap() []error works, etc for those people that don't really care why but just the how.

And finally, I understand that the last boilerplate breaks if the child could further be wrapped by another error type. But this was okay in our use case because we would tell the users what type of errors would be stuffed in the parent error.

earthboundkid commented 1 year ago

ISTM, this is blocked until #56413 is resolved, then it should just return an iterator. Suggested name errors.BreadthFirst.

neild commented 1 year ago

Thanks for the detailed explanation @lestrrat.

You suggest that you'd like users to write code like this:

for _, child := range errors.Errors(err) {
  switch child := child.(type) {
  case *mypkg.InterestingErr1:
    // ...
  }
}

This approach only works when err wraps the exact list of errors to examine. If err or one of the errors it contains is wrapped, then this will fail to properly detect the desired contained error.

For example, if we write a function that wraps the result of driverCode such as this

func f1(() error {
  if err := driverCode(); err != nil {
    return fmt.Errorf("additional context: %w", err)
  }
  // ...
}

then a single-level unwrapping of the error returned by f1 will not properly examine the contained errors.

To avoid the need to reason about the structure of the error tree, I would recommend using errors.Is or errors.As to look for the desired error instead, since these functions will recursively examine the entire error tree.

If there is a need to provide a specific list of errors to the user, I'd recommend defining concrete error type that contains that error. For example:

type DriverError {
  Errs []error
}

func (e DriverError) Error() string { return "some error text" }
func (e DriverError) Unwrap() []error { return e.Errs }

Users can extract the DriverError type using either a type assertion or errors.As to get at the list of contained errors.

lestrrat commented 1 year ago

@neild I think the main point of my argument got lost because of my bad example.

My main point was that errors.Is or errors.As still requires the users to know the concrete type of the errors (as in errors.As(err, &exactErrorType)) that they are dealing with, and are as bad as the boilerplate code that you specifically told me not to write. It's especially bad because it won't be able to "extract" simple errors created by things like errors.New() (var err error; errors.As(parentErr, &err) won't work for obvious reasons, leaving you no way to access this individual error)

In real like I can the user to enumerate, and do whatever they want to do with the contained errors, without having to explicitly name their error types

for i, err := range errors.Errors(err) {
   // do what you want, without knowing the exact type that err can take
   // maybe: log.Printf("error %d was %s", i, err), or 
  //                errCh <- err // feed individual errors somewhere
}
neild commented 1 year ago

In general, if you want to provide semantically meaningful errors to users, you need to either expose sentinel values (such as io.EOF) or concrete error types (such as os.PathError) in your API.

rsc commented 1 year ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

lestrrat commented 1 year ago

@neild I think you are right that that's the "correct" way to code. But I argue that there are at least two circumstances when I still want to examine individual errors

First: The code has not been coded correctly. I've done this myself, and I've seen plenty of cases where nobody foresaw that an error needed to be distinguished from another one using sentinel values or concrete types. Sometimes you just need to use an escape hatch.

func Interesting() error {
  ...
  return fmt.Errorf(`you will never need to distinguish this from other errors, would ya?`)
}

if err := Interesting(); err != nil {
   for _, err := range ExtractErrors(err) {
      if err.Error() == `you will never need to distinguish this from other errors, would ya?` { // escape hatch until next release
         ...
      }
   }
}

Second: You are debugging some error that you haven't seen before. I often sneak in a log.Printf("%#v", err) when I receive an unexpected error value. Only then do I know what sentinel value/concrete error type I would pass to errors.As.

func Interesting() error { ... }

if err := Interesting(); err != nil {
   for _, err := range ExtractErrors(err) {
      log.Printf("%#v", err)
   }
}

I apologize that my arguments have been shifting a bit, but all I really want to say throughout is that I'm pretty sure there's going to be needs in the wild to dissect the contained errors in a multi-error value without knowing the concrete error type/value beforehand.

rsc commented 1 year ago

It doesn't seem like there is consensus that we should add Split. We only just added Join. Perhaps we should wait for more use cases. If third-party users write and use the Split function, it will be easy enough to add later and will not invalidate any of the other copies.

rsc commented 1 year ago

Based on the discussion above, this proposal seems like a likely decline. — rsc for the proposal review group

rsc commented 1 year ago

No change in consensus, so declined. — rsc for the proposal review group

mitar commented 1 year ago

Hm, should simply errors.Unwrap traverse both joined and regular wrapped errors?

earthboundkid commented 1 year ago

Assuming iterators are added to a future version of Go, there could be a new proposal to add errors.Iter or whatever to walk the error tree if people can show that third parties are already using similar functions elsewhere (eg in popular error logging packages).

igorcafe commented 1 year ago

The naming is not that relevant, but the idea of having errors.Join without having errors.Split / errors.UnwrapList (whatever it is called) doesn't make much sense. errors.Join is currently an incomplete feature. Why would Join wrap errors if there is no way in the library for unwrapping it?

mitar commented 1 year ago

After my previous comment I have realized that joined errors have much more semantic meaning in the join itself than regular wrap. Regular wrap can be used for many reasons to annotate the original error (with a stack trace, additional data, messages, etc.) and so automatic traversal of that wraps make sense when you are searching for example for which layer of wrapping recorded a stack trace or which type the base error is. When joining errors, semantic if much clearer: those multiple errors happened and we are combining them now here. But reversing/traversing those joined errors automatically have much less sense: if you are searching for a stack trace, which one you want from all those errors? If you are searching for a base type, does it make sense to return if any of the errors have that base type?

So in my errors package I have made it so that unjoining is a very different action than just simple linear unwrapping.

neild commented 1 year ago

Why would Join wrap errors if there is no way in the library for unwrapping it?

An error wrapped with Join may be unwrapped with errors.Is or errors.As.

If Is/As aren't the right operation on the wrapped error, then Join is probably the wrong tool for the job. If you need to return a list of errors to a caller, then returning []error or a named error type that contains an []error is going to be the better choice. (See https://github.com/golang/go/issues/57358#issuecomment-1360085591 for some more details.)

mitar commented 1 year ago

@neild I think that the fact that errors.Is and errors.As traverse also Unwrap() []error is a mistake for two reasons:

Maybe errors.UnwrapMany should be introduced, with []error as return type, returning []error{err} when there is Unwrap() error and returning errors from Unwrap() []error when that exist. But I think in fact that errors.Is and errors.As should simply not traverse joined errors automatically, but one should first do errors.As on Unwrap() []error interface, and then manually decide how to proceed next.

But it is true that one can currently already search for Unwrap() []error using errors.As if they care about the order of traversal. So probably there is no need to change the existing and released behavior. Maybe just a note about this in documentation?

neild commented 1 year ago

53435 contains fairly extensive discussion on the rationale for the interactions between Join, Is, and As. The purpose of Join is to combine errors in a way that Is and As can inspect. If you don't want that behavior, then you don't need or want Join.

In particular, see the questions "Why should this be in the standard library?" and "Why is there no function to unwrap a multiply-wrapped error?" in https://github.com/golang/go/issues/53435#issuecomment-1191752789.