DataDog / dd-trace-go

Datadog Go Library including APM tracing, profiling, and security monitoring.
https://docs.datadoghq.com/tracing/
Other
653 stars 436 forks source link

contrib/internal/httptrace: proposal: Add option to ignore setting errors for 5xx HTTP status codes #2390

Open kazukousen opened 10 months ago

kazukousen commented 10 months ago

Only errors located in the uppermost service span are processed by ErrorTracking.

We use tracer.SetTag(ext.Error, err) to record error in the uppermost service span.

However, the gorilla/mux tracer we use which internally uses http tracer, also overrides the error with tracer.SetTag() when the HTTP Errors (5xx) occur. https://github.com/DataDog/dd-trace-go/blob/6becedc3e7d4c8da52788409123a61f2e07c316a/contrib/internal/httptrace/httptrace.go#L74-L76

As a result, ErrorTracking groups all errors into a generic 5xx error issue, displays examples like 500: Internel Server Error in its web UI. image

We'd like discuss about adding an option to ignore setting the error. this could be implemented as follows:

if !cfg.disabledSetHTTPError && status >= 500 && status < 600 {
    s.SetTag(ext.Error, fmt.Errorf("%s: %s", statusStr, http.StatusText(status)))
}
kazukousen commented 9 months ago

I worked on #2432

@ajgajg1134 @katiehockman Hi, folks. I would be grateful if you could take some time to review it.

katiehockman commented 8 months ago

Hi @kazukousen, thanks for filing this issue. We've been discussing this, and wonder if we should instead verify that there is not already an error status code and error tag on the span before we overwrite it. We're not sure that there is a good use case for overriding it, so rather than make this configurable, we should probably just fix this. Let us know what you think, and if this would fix your issue.

We'll work on this and add some tests, and get back in touch with you.

katiehockman commented 8 months ago

We're discussing adding a Tag(key string) interface{} method to the Span struct, which given a key would return the tag value. This would allow us to check if there is already an error tag here, and if there is, either 1) don't change the error or 2) concatenate the error so it doesn't get lost. That will also avoid needing additional configuration.

kazukousen commented 8 months ago

Hi @katiehockman, Thank you for your reply.

We've been discussing this, and wonder if we should instead verify that there is not already an error status code and error tag on the span before we overwrite it. We're not sure that there is a good use case for overriding it, so rather than make this configurable, we should probably just fix this. Let us know what you think, and if this would fix your issue.

That's make sense for us. It is clear. Thanks!