census-instrumentation / opencensus-go

A stats collection and distributed tracing framework
http://opencensus.io
Apache License 2.0
2.06k stars 327 forks source link

ocgrpc server instrumentation drops stats tags from the passed OC context #1247

Open parabolala opened 3 years ago

parabolala commented 3 years ago

Please answer these questions before submitting a bug report.

What version of OpenCensus are you using?

v0.22.5

What version of Go are you using?

1.15

What did you do?

I wanted to add an extra label/tag to the ocgcrpc-generated server metrics: client ID, and export them with a custom view. See a sample gRPC+metrics server. In particular lines around clientIDValue.

package main

import (
    "flag"
    "fmt"
    "log"
    "net/http"
    "strings"

    "contrib.go.opencensus.io/exporter/prometheus"
    "github.com/golang/glog"
    "go.opencensus.io/plugin/ocgrpc"
    "go.opencensus.io/stats/view"
    "go.opencensus.io/tag"
    "golang.org/x/net/http2"
    "golang.org/x/net/http2/h2c"
    "google.golang.org/grpc"
    "google.golang.org/grpc/health"
    healthpb "google.golang.org/grpc/health/grpc_health_v1"
    "google.golang.org/grpc/reflection"
)

var (
    KeyGRPCClient, _ = tag.NewKey("client")
)

func GRPCServerViewsPerClient() []*view.View {
    res := []*view.View{}
    for _, v := range ocgrpc.DefaultServerViews {
        customView := *v
        customView.Name = fmt.Sprintf("%s_per_client", customView.Name)
        customView.TagKeys = append(customView.TagKeys, KeyGRPCClient)
        res = append(res, &customView)
    }
    return res
}

func isGrpcRequest(r *http.Request) bool {
    return r.ProtoMajor == 2 && strings.HasPrefix(r.Header.Get("Content-Type"), "application/grpc")
}

func main() {
    flag.Parse()
    if err := view.Register(GRPCServerViewsPerClient()...); err != nil {
        glog.Fatal(err)
    }

    grpcServer := grpc.NewServer(grpc.StatsHandler(
        &ocgrpc.ServerHandler{},
    ))
    healthServer := health.NewServer()
    healthpb.RegisterHealthServer(
        grpcServer,
        healthServer,
    )

    httpMux := http.NewServeMux()
    pe, _ := prometheus.NewExporter(prometheus.Options{
        Namespace: "",
    })
    httpMux.Handle("/metrics", pe)

    muxHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        clientIDValue := "foo@bar.com"
        ctx, err := tag.New(r.Context(),
            tag.Insert(KeyGRPCClient, clientIDValue),
        )
        if err != nil {
            glog.Errorf("Failed adding tag to context: %s", err)
            http.Error(w, err.Error(), 500)
            return
        }
        r = r.WithContext(ctx)
        if isGrpcRequest(r) {
            grpcServer.ServeHTTP(w, r)
            return
        }
        httpMux.ServeHTTP(w, r)
    })

    reflection.Register(grpcServer)
    listenAddr := "localhost:8000"
    glog.Infof("Listening on %s", listenAddr)
    log.Fatal(http.ListenAndServe(listenAddr, h2c.NewHandler(muxHandler, &http2.Server{})))
}

Then I send a grpc.health.v1.Health/Check() request and check the /metrics output.

What did you expect to see?

I expected the value for the KeyGRPCClient OC tag to show up in the ocgrpc view, i.e. get populated in the prom exporter hander as client="foo@bar.com" label

What did you see instead?

The value from the Go/OC context is not propagated to the ocgrpc context. The exported metrics have the label client=""

Additional context

I believe the problem is at https://github.com/census-instrumentation/opencensus-go/blob/e736602bcaf5ed1e2191a730813ac2bc7caed9eb/plugin/ocgrpc/server_stats_handler.go#L43, where the oc tag map in the context is overwritten with tag.NewContext() to populate it with the tags from grpc metadata.

Currently there's a NewContext() method in the tag module to overwrite the tags map and New() which is used with mutators to add/remove single value.

At a first glance the solution might be to add a new mutator to merge the provided map (similar to the one in New()) into the current context and use the mutator instead of tag.NewContext().

This looks somewhat related to #746 but this one seems to have an easy solution that doesn't introduce the mutability concerns.