coupergateway / couper

Couper is a lightweight API gateway designed to support developers in building and operating API-driven Web projects
https://couper.io
MIT License
85 stars 15 forks source link

generic backend error cannot be handled by backend error handler #707

Closed johakoch closed 1 year ago

johakoch commented 1 year ago

A dial backend error cannot be handled by backend error handler.

server {
  api {
    endpoint "/**" {
      proxy {
        backend {
          origin = "https://unkn.own"
        }
      }

      error_handler "backend" {
        response {
          status = 418
        }
      }
    }
  }
}
$ curl -si http://localhost:8080/anything
HTTP/1.1 502 Bad Gateway
Couper-Error: backend error
...

{
  "error": {
    "id":      "cfgcdujeg9s9ovgia6e0",
    "message": "backend error",
    "path":    "/anything",
    "status":  502
  }
}

with log:

ERRO[0002] backend error: anonymous_4_13: connecting to anonymous_4_13 'unkn.own:443' failed: dial tcp: lookup unkn.own on [...]: no such host [...]

I'd expect the backend error to be handled by the error handler.

Same with request:

server {
  api {
    endpoint "/anything" {
      request "r" {
        url = "https://unkn.own/"
      }

      proxy {
        backend {
          origin = "https://httpbin.org"
        }
      }

      error_handler "backend" {
        response {
          status = 418
        }
      }
    }
  }
}
johakoch commented 1 year ago

The error thrown (in innerRoundTrip(), handler/transport/backend.go) is an errors.Backend:

    beresp, err := b.transport.RoundTrip(req)
...
    if err != nil {
        select {
        case derr := <-deadlineErr:
            if derr != nil {
                return nil, derr
            }
        default:
            if _, ok := err.(*errors.Error); ok {
                return nil, err
            }

            return nil, errors.Backend.Label(b.name).With(err)
        }

In newErrorHandler() (config/runtime/error_handler.go) we seem to not expect "generic" super-kind errors to be thrown and so delete the superKind:

        if superKindMap, mapExists := errors.SuperTypesMapsByContext[ref]; mapExists {
            // expand super-kinds:
            // * set super-kind error handler for mapped sub-kinds, if no error handler for this sub-kind is already set
            // * remove super-kind error handler for super-kind
            for superKind, subKinds := range superKindMap {
                if skHandler, skExists := handlersPerKind[superKind]; skExists {
                    for _, subKind := range subKinds {
                        if _, exists := handlersPerKind[subKind]; !exists {
                            handlersPerKind[subKind] = skHandler
                        }
                    }

                    delete(handlersPerKind, superKind)
                }
            }
        }

So the configured backend error_handler is not suitable for generic backend errors anymore.

johakoch commented 1 year ago

There's one other instance of throwing errors.Backend (also in handler/transport/backend.go):

func setGzipReader(beresp *http.Response) error {
...
    var src io.Reader
    src, err := gzip.NewReader(beresp.Body)
    if err != nil {
        return errors.Backend.With(err).Message("body reset")
    }

@malud Should we create specific error kinds for these cases?

johakoch commented 1 year ago

Another similar case is in accesscontrol/ac.go:

func (i ListItem) Validate(req *http.Request) error {
    if err := i.control.Validate(req); err != nil {
        if e, ok := err.(*errors.Error); ok {
            return e.Label(i.label)
        }
        return errors.AccessControl.Label(i.label).Kind(i.kind).With(err)
    }

Here the error thrown is errors.AccessControl (another super-kind), if the specific Validate() method does not already return an errors.Error. Fortunately, all access control implementations do return errors.Errors in all error cases.

johakoch commented 1 year ago

We decided to create specific error types for the two backend cases. How should we name them?

Proposal:

johakoch commented 1 year ago

The dial error is thrown in handler/transport/transport.go:

func NewTransport(conf *Config, log *logrus.Entry) *http.Transport {
...
    transport := &http.Transport{
        DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
...
            conn, cerr := d.DialContext(stx, network, address)
            if cerr != nil {
                host, port, _ := net.SplitHostPort(conf.Origin)
                if port != "80" && port != "443" {
                    host = conf.Origin
                }
                if os.IsTimeout(cerr) || cerr == context.DeadlineExceeded {
                    return nil, fmt.Errorf("connecting to %s '%s' failed: i/o timeout", conf.BackendName, host)
                }
                return nil, fmt.Errorf("connecting to %s '%s' failed: %w", conf.BackendName, conf.Origin, cerr)
            }

and is passed to the aforementioned (in handler/transport/backend.go)

func (b *Backend) innerRoundTrip(req *http.Request, tc *Config, deadlineErr <-chan error) (*http.Response, error) {
...
    beresp, err := b.transport.RoundTrip(req)
...
    if err != nil {
        select {
        case derr := <-deadlineErr:
            if derr != nil {
                return nil, derr
            }
        default:
            if _, ok := err.(*errors.Error); ok {
                return nil, err
            }

            return nil, errors.Backend.Label(b.name).With(err)
        }
    }

@malud Which other errors would be handled by the default case? Timeout errors are handled by case derr := <-deadlineErr, right?

johakoch commented 1 year ago

There' s another bug in the relevant code. Consider this configuration:

server {
  api {
    endpoint "/**" {
      proxy {
        backend {
          origin = "https://httpbin.org"
          timeout = "1ns" # <- causing backend_timeout error
        }
      }

      error_handler "*" {
        response {
          status = 204
          headers = {
            from = "*"
          }
        }
      }

      error_handler "backend" {
        response {
          status = 204
          headers = {
            from = "backend"
          }
        }
      }
    }
  }
}

The backend_timeout error should be handled by the "backend" error_handler, but is instead handled by the "*" error_handler. This is due to in

            for superKind, subKinds := range superKindMap {

having superKind == "*" before having superKind == "backend", and so the "*" error_handler registering for all the free positions before the "backend" error_handler can.