aws / aws-sdk-go-v2

AWS SDK for the Go programming language.
https://aws.github.io/aws-sdk-go-v2/docs/
Apache License 2.0
2.57k stars 619 forks source link

imds errors should be compatible with the smithy.APIError interface #1306

Open TopherGopher opened 3 years ago

TopherGopher commented 3 years ago

Confirm by changing [ ] to [x] below:

Describe the question I've gone through the error handling documentation, and I'm having some issue around detecting 403/404 status codes from the metadata service and S3. This is a snippet of the code from imds specifically - you should be able to replicate a 404 code by running this on a local environment without a mocked imds service:

    c := imds.NewFromConfig(awsConf)
    output, err := c.GetInstanceIdentityDocument(
        context.Background(), &imds.GetInstanceIdentityDocumentInput{}, func(options *imds.Options) {
            options.Retryer = retry.AddWithMaxAttempts(aws.NopRetryer{}, 1)
    if err != nil {
        var gaiErr *smithy.GenericAPIError
        var aerr smithy.APIError
        var rerr *awshttp.ResponseError
        var opErr *smithy.OperationError
        switch {
        case errors.As(err, &rerr) && rerr.HTTPStatusCode() == 404:
            // If the metadata service isn't available, then return
            // a dummy document. This is common when running
            // aws-runas in --ec2 mode
            awsMetadata = imds.InstanceIdentityDocument{
                Region:    awsConf.Region,
                AccountID: qaAWSAccountID,
            }
            return awsMetadata, nil
        case errors.As(err, &aerr):
            // If the metadata service isn't available, then return
            // a dummy document. This is common when running
            // aws-runas in --ec2 mode
            awsMetadata = imds.InstanceIdentityDocument{
                Region:    awsConf.Region,
                AccountID: qaAWSAccountID,
            }
            l.WithFields(logrus.Fields{
                "err":               err,
                "aerr.ErrorCode":    aerr.ErrorCode(),
                "aerr.ErrorMessage": aerr.ErrorMessage(),
                "aerr.ErrorFault":   aerr.ErrorFault(),
                "aerr.Error":        aerr.Error(),
                "aerr":              aerr,
                "errType":           "APIError",
            }).Error("Could not fetch instance identity document - APIError")
        case errors.As(err, &gaiErr):
            // If the metadata service isn't available, then return
            // a dummy document. This is common when running
            // aws-runas in --ec2 mode
            awsMetadata = imds.InstanceIdentityDocument{
                Region:    awsConf.Region,
                AccountID: qaAWSAccountID,
            }
            l.WithFields(logrus.Fields{
                "err":                 err,
                "gaiErr.ErrorCode":    gaiErr.ErrorCode(),
                "gaiErr.ErrorMessage": gaiErr.ErrorMessage(),
                "gaiErr/ErrorFault":   gaiErr.ErrorFault(),
                "gaiErr.Error":        gaiErr.Error(),
                "gaiErr":              gaiErr,
                "errType":             "GenericAPIError",
            }).Error("Could not fetch instance identity document")
        case errors.As(err, &opErr):
            // This is an unexpected error
            l.WithFields(logrus.Fields{
                "err":             err,
                "opErr.Operation": opErr.Operation(),
                "opErr.Service":   opErr.Service(),
                "opErr.Unwrap":    opErr.Unwrap(),
                "errType":         "OperationError",
            }).Error("Could not fetch instance identity document for an unanticipated reason as OperationError")
        default:
            // Not an operation error - may never hit default
            l.WithFields(logrus.Fields{
                "message": err,
                "note":    "We got an unexpected error",
            }).Error("Could not retrieve metadata about the EC2 instance")
        }
    }

This hits the OperationError block - Output:

level=error
msg="Could not fetch instance identity document for an unanticipated reason as OperationError"
err="operation error ec2imds: GetInstanceIdentityDocument, http response error StatusCode: 404, request to EC2 IMDS failed"
errType=OperationError
func=LoadAccountRelatedMetadata
opErr.Operation=GetInstanceIdentityDocument
opErr.Service=ec2imds
opErr.Unwrap="http response error StatusCode: 404, request to EC2 IMDS failed"

I might not be understanding how to work with errors here though. Do I check for OperationError, Unwrap, then use errors.As? Is there a concrete error type somewhere for checking for AccessDenied and NotFound errors?

jasdel commented 3 years ago

Thanks for reaching out @TopherGopher. The Handling Errors doc might help with this, and provide more details on how to handle errors with the SDK. There are areas that need to be filled out in those docs such as the specific error types that could be returned, and checked for.

The v2 SDK took a very different approach to error handing vs the v1 SDK. The errors returned by API operation calls are layered with the OperationError at the top. While you could use Unwrap at each layer to inspect the subsequent layer, it is not necessary. The errors.As utility will perform the walking automatically, and walk down each layer until it finds an error that matches the type, or interface specified. Generally you don't want to use errors.Is at all with SDK errors, as it checks for error values that are equal, not same type/interface.

For example, if you're only interested in if the error has a HTTP status code, you could check for either an error with a method of HTTPStatusCode() int, or check for the ResponseError type directly.

type StatusCodeError interface {
    HTTPStatusCode() int
}

var statusCodeErr StatusCodeError
if error.As(err, &statusCodeErr) {
    fmt.Println("got status code", statusCodeErr.HTTPStatusCode())
}
// import smithyhttp "github.com/aws/smithy-go/transport/http"

type respErr smithyhttp.ResponseError
if error.As(err, &respErr) {
    fmt.Println("got status code", respErr.HTTPStatusCode())
}

This pattern can be applied to any of the layered errors returned by the SDK. The biggest limitation to this pattern is that you cannot specify an interface that combines multiple error layers, e.g. HTTPStatusCode and OperationError's Operation name


If your application is interested in checking for API errors from a service response you can either check for the specific error or generic API error code and message via type, smithy.APIError. You'll probably never need to check against smithy.GenericAPIError directly. This type is a fallback type the SDK will deserialize the response error into for API errors returned by a service that the SDK does not know about, and does not have a generated type for.

TopherGopher commented 3 years ago

Hey @jasdel - Thanks for the response.

I linked the same error handling doc and included an implementation of catching the smithy.APIError and a different version of the ResponseError in my example code, so cool - we're on the same page with how to call them and my understanding of using Is/As is correct it sounds like. The key piece I was missing that you said in your response is that it's a smithyhttp.ResponseError, which wasn't in the documentation - I believe the one in the docs is github.com/aws/aws-sdk-go-v2/aws/transport/http as opposed to github.com/aws/smithy-go/transport/http It's confusing to me though that the imds 404 error would still not be recognized as either as a smithy.APIError or awshttp.ResponseError, and instead had to be the smithyhttp.ResponseError error, but I'll use this pattern moving forward.

Is there an error type for authentication failure specifically for all libraries? How does your team handle authentication errors?

jasdel commented 3 years ago

Thanks for the update.

The key piece I was missing that you said in your response is that it's a smithyhttp.ResponseError, which wasn't in the documentation - I believe the one in the docs is github.com/aws/aws-sdk-go-v2/aws/transport/http as opposed to github.com/aws/smithy-go/transport/http

Thanks for pointing that out. The aws/transport/http#ResponseError embeds the smithy-go version and should be used by the SDK all HTTP based response errors. The smithy-go version is supposed to be a generic ResponseError container, that could be use with other SDK clients not just AWS.

It's confusing to me though that the imds 404 error would still not be recognized as either as a smithy.APIError or awshttp.ResponseError, and instead had to be the smithyhttp.ResponseError error, but I'll use this pattern moving forward.

Thanks for pointing that out. I'd forgotten that the EC2 IMDS Client uses the smithy-go version of ResponseError. I think the intention here is that EC2 IMDS API does not send back a RequestID like the other AWS API do. Which was the reason for it being smithy-go version. Not having the RequestID is the distinction the SDK's version adds between two types.

I think we need to review if this should be returning the AWS version of ResponseError because, while the IMDS API does not provide a ResponseID, it still is probably within the scope of an AWS API client.

Is there an error type for authentication failure specifically for all libraries? How does your team handle authentication errors?

Authentication errors will generally be 403 HTTP status code, and no the SDK does not handle these. The SDK considers authentication errors to be terminal failures. This is for two reasons:

TopherGopher commented 3 years ago

Should I create a ticket to get the IMDS client to start supporting a smithy.APIError?

I'm also having additional issues trying to use type specific errors in SQS.

For example, I'm trying to do a SendMessage and confirm a queue does not exist:

    params := sqs.SendMessageInput{
        MessageBody: aws.String(body),
        QueueUrl:    aws.String("https://sqs.us-east-1.amazonaws.com/165297049243/TOPHER_I_DONT_EXIST"),
    }
    _, err := q.service.SendMessage(context.Background(), &params)
    if err != nil {
        var QueueDoesNotExist *types.QueueDoesNotExist
        var apiErr smithy.APIError
        if errors.As(err, &QueueDoesNotExist) {
                         // When a queue is not found, it should hit this block
             fmt.Println("I'M THE CORRECT ERROR")
                          panic(QueueDoesNotExist)
        } else if errors.As(err, &apiErr) {
                         // But this error is considered still to be an APIError.
                         fmt.Println("I'M AN API ERROR")
            panic(apiErr)
        }

You'll see that there is a block for checking errors.As(err, &QueueDoesNotExist) against *types.QueueDoesNotExist, but it doesn't hit this block. Instead, it hits the generic APIError block and returns the following:

I'M AN API ERROR
panic: api error AWS.SimpleQueueService.NonExistentQueue: The specified queue does not exist for this wsdl version.

Obviously, I can catch the apiErr and check the apiErr.ErrorCode() for AWS.SimpleQueueService.NonExistentQueue, but it seems like unpacking into types.QueueDoesNotExist should work based on what I'm doing in other libraries.

Is this a bug in types.QueueDoesNotExist specifically or am I somehow misusing it?

jasdel commented 3 years ago

Thanks for the update @TopherGopher. Lets keep this issue for the IMDS issues.

Though, I think we should create a separate issue for the SQS issue you found. This issue probably will impact all AWS SDKs that generate types, and unmarshalers for modeled errors.

It looks like the SQS API model is incomplete, or incorrectly modeling the errors that are returned. The SDK only knows about the QueueDoesNotExist (api model) "error code", where as it looks like the services is returning a completely different error code, AWS.SimpleQueueService.NotExistentQueue that the SDKs have no knowledge of. The API model will need to be updated in order for the SDKs to have knowledge of this error, and be able to deserialize it.

TopherGopher commented 3 years ago

Agree - let's work IMDS issue here - tldr; the IMDS service returns errors that are not recognized as smithy.APIErrors and are not parsable into awsHttp.ResponseError (github.com/aws/aws-sdk-go-v2/aws/transport/http), but rather are only recognized as smithyHttp.ResponseError (github.com/aws/smithy-go/transport/http) In my ideal world, we should be able to catch the errors coming from the imds service as a generic APIError

For SQS, I created https://github.com/aws/aws-sdk-go-v2/issues/1319 to track those issues.

github-actions[bot] commented 2 years ago

We have noticed this issue has not received attention in 1 year. We will close this issue for now. If you think this is in error, please feel free to comment and reopen the issue.

TopherGopher commented 2 years ago

Please keep this issue open.