signalfx / signalfx-go

Go client library and instrumentation bindings for SignalFx
https://www.signalfx.com
Apache License 2.0
14 stars 51 forks source link

Errors from server should be wrapped for signal-fx client consumer to access #148

Closed adityavit closed 5 months ago

adityavit commented 2 years ago

Currently an error received from the server doesn't wrap the exact error for failure. Which makes it difficult for the users of the client library to take appropriate actions based on the type of the error.

Example of error received while deleting a muting alert in the past. Code: https://github.com/signalfx/signalfx-go/blob/master/alertmuting.go#L60

Unexpected status code: 400: {
  "code" : 400,
  "message" : "Cannot delete alert muting in the past."
}

Here the expectation is to wrap the error as a part of the returned error (https://go.dev/blog/go1.13-errors). So that it is easier for the users of the library to make conditions like.

if errors.Is(err, signalFx.ErrMutingAlertInPast)
san-san commented 2 years ago

Is there an update on this?

pellared commented 2 years ago

@adityavit @san-san I started looking into this.

Would it help (and be good enough) if we return an error with a structure more-or-less like the one below?

type APIError struct {
   StatusCode int
   Message string
}

Then the caller could use errors.As to decode the Message e.g. like this:

if err := client.DeleteAlertMutingRule(ctx, name); err != nil {
  var e *signalfx.APIError
  if errors.As(err, &e) {
    fmt.Println("API error message:", e.Message)
  } else {
    fmt.Println("Unknown error:", err)
  }
}

This error could be used in all signalfx.Client methods.

Right now, the SignalFx API endpoint returns only status code and error text message in case of an API error.

The messages are not documented so I would not assume that they are stable. Therefore, at this point of time it would be not safe to add errors like ErrMutingAlertInPast. However, I feel it is rather unlikely that the existing error messages are going to change.

SignalFx API docs: https://dev.splunk.com/observability/reference/api/incidents/latest#endpoint-delete-single-muting-rule

adityavit commented 2 years ago

Thanks for your response. The condition, we are trying to avoid in our code because of the current way the errors are returned is given below.

mutingRuleDeletedInPastErrMsg = "Cannot delete alert muting in the past"
err := s.client.DeleteAlertMutingRule(mutingData.ID)
if err != nil {
if strings.Contains(err.Error(), mutingRuleDeletedInPastErrMsg) {
            // Take action as rule doesn't exist anymore.
        }
       // Capture other error condition     
      }
}

I am not sure the solution provided above, will be able to help capture such specific errors? Users of library should be able to check specific errors, without checking for strings in the error message.

pellared commented 2 years ago

I am not sure the solution provided above, will be able to help capture such specific errors?

It will work, but I cannot guarantee that the error responses will not change in future.

Users of library should be able to check specific errors, without checking for strings in the error message.

I agree. However, at this moment it is not possible. There is no contract in the backend API regarding error responses other than HTTP status codes.

pellared commented 2 years ago

Currnetly, I think you can check if the alert is from the past by calling client.GetAlertMutingRule(mutingData.ID) and comparing the StopTime value. Would it help you?

EDIT: Moreover, I see you already have the mutingData value so I expect you could even do it before calling DeleteAlertMutingRule.