googleapis / google-api-go-client

Auto-generated Google APIs for Go.
https://pkg.go.dev/google.golang.org/api
BSD 3-Clause "New" or "Revised" License
4.01k stars 1.12k forks source link

OpenTelemetry traces misused by transport/http #1573

Open adg opened 2 years ago

adg commented 2 years ago

The transport/http package by default wraps the HTTP transport with an OpenTelemetry ochttp.Transport. That transport wrapper will record HTTP traces for a subset of all requests if OpenCensus tracing is enabled.

However, it is misconfigured. The Transport type has this field, which is unset google-api-go-client:

    // NameFromRequest holds the function to use for generating the span name
    // from the information found in the outgoing HTTP Request. By default the
    // name equals the URL Path.
    FormatSpanName func(*http.Request) string

The default is for the transport to name each span after the request URL path. In practice, this means a proliferation of differently named spans. For example, if you use the Google Cloud Storage library, you will end up with a span for every file you access. This is not how spans are supposed to be named - they should be named after a function or API endpoint, perhaps including some small, finite set of user-specified options.

This becomes pathological if you actually record the traces somewhere. For instance, if you have the OpenCensus zpages debug endpoints enabled (common in production systems), a sample of those traces will be stored in a local store. While old traces for a given span are purged, the spans themselves are never purged, and so a running process will accumulate traces indefinitely for every HTTP request path made by google-api-go-client. Loading the OpenCensus tracez page in such cases is pretty funny: a production service of mine had accumulated several gigabytes of traces, and its tracez endpoint served so much HTML that it crashed my browser.

I think the fix here is to specify a FormatSpanName function when setting up the ochttp.Transport. I'm not sure how exactly the spans should be named - probably in some service-specific way - but an immediate remedy would be to give all requests the same span name (google-api-go-client?).

adg commented 2 years ago

A related issue: https://github.com/googleapis/google-cloud-go/issues/1573

adg commented 2 years ago

Also related: https://github.com/census-instrumentation/opencensus-go/issues/1180

codyoss commented 2 years ago

This is a bit tricky... I think we could easily do as you suggest, but then there could be no aggregations on method level calls. In general most of our libraries are not nicely producing their own spans, so we rely on the naming to capture the context of the call. Making this change would force nearly all API calls to have the same span name. Do you think think is much better than many span names?(maybe it is)

For storage, on the reader side we do create custom spans. It seems to me that we should also do the same on the writer side, that would alleviate the example you provided at least.

If we did aggregate these I suppose people could always examine the http.path attribute for reporting, but I am not sure that is much better than what is there today in this case. Thoughts?

apesternikov commented 2 years ago

workaround that works for me:

import (
         htransport "google.golang.org/api/transport/http"
)
httpClient, _, err := htransport.NewClient(ctx, option.WithTelemetryDisabled())
    if err != nil {
        return nil, err
    }
    httpClient.Transport = &ochttp.Transport{
        Base: httpClient.Transport,
        Propagation:    &propagation.HTTPFormat{},
        FormatSpanName: tracingutils.FormatHttpSpanName,
    }
    client, err := storage.NewClient(ctx, option.WithHTTPClient(httpClient))

in tracingutils package:

func FormatHttpSpanName(req *http.Request) string {
    path := req.URL.Path
    if len(path) < 1 {
        return path
    }
    if path[0] == '/' {
        path = path[1:]
    }
    switch req.Host {
    case "storage.googleapis.com":
        path, _, _ = strings.Cut(path, "/")
    default:
    }
    return req.Host + "/" + path
}
adg commented 2 years ago

@codyoss

Making this change would force nearly all API calls to have the same span name. Do you think think is much better than many span names?(maybe it is)

I think it would be better for the Cloud APIs that have user-generated content in the paths. In our case we saw a memory leak of several GB in one of our production services because of this default. So perhaps this suggestion of yours is the way to go:

For storage, on the reader side we do create custom spans. It seems to me that we should also do the same on the writer side, that would alleviate the example you provided at least.

And perhaps also worth inspecting the other Cloud APIs beyond Storage to see whether they have user-generated content in the Paths. Hopefully it's just Storage writes that are a problem.