census-instrumentation / opencensus-go

A stats collection and distributed tracing framework
http://opencensus.io
Apache License 2.0
2.05k stars 327 forks source link

plugin/ocgrpc: add B3 trace propagation #666

Open terinjokes opened 6 years ago

terinjokes commented 6 years ago

In February, support for B3 propagation was added for HTTP clients and servers in #380. It is common in B3 environments to also use these same headers in GRPC metadata, as described in b3-propagation.

I'm keen to try using OpenCensus in my project, but I need to maintain tracing compatibility with other services over both HTTP and GRPC. At this time, the ocgrpc plugin does now allow for alternative propagation formats.

rakyll commented 6 years ago

/cc @bogdandrutu

rakyll commented 6 years ago

gRPC uses the propagation format we implement in OpenCensus and even though tracing is configurable in some language implementations of gRPC, it is not in some languages like Java. If we allow the Go implementation to use B3, it won't be compatible with gRPC Java.

This might be a problem if you are depending on a sidecar or load balancer that only understands gRPC B3 but shouldn't be a problem otherwise.

You can always propagate traces in the binary format among gRPC servers and use our HTTP propagation formats when propagating on HTTP. Inside the process, we convert the headers/metadata to trace.SpanContext anyways. How you propagate on wire is not that important unless you are intercepting network and trying to interact with tracing headers/metadata at that level.

terinjokes commented 6 years ago

This might be a problem if you are depending on a sidecar or load balancer that only understands gRPC B3 but shouldn't be a problem otherwise.

We do this, as well as communicating over GRPC to services that have no knowledge of OpenCensus, but expect B3 propagation over GRPC.

rakyll commented 6 years ago

I see. Let's wait for @bogdandrutu's opinions on this.

bogdandrutu commented 6 years ago

@terinjokes I have some questions:

terinjokes commented 6 years ago

I'm primarily writing Go. Many of the services I interact with (which our likely using OpenZipkin instead of OpenCensus) are also written in Go, though some are not.

To prevent fragmentation of distributed tracing data, we've specified using B3 propagation between all services, where this is a reasonable ask. As linked above, this is defined for HTTP (headers) and GRPC (metadata), and we use similarly named fields when enqueing in a message queue.

B3 was chosen for its broad support between applications, languages, and frameworks.

rakyll commented 6 years ago

Talked to @bogdandrutu offline and we think providing an option for B3 propagation is fair and required to support legacy systems. We will keep using the binary format and add B3 headers if users wanted them.

bogdandrutu commented 6 years ago

@terinjokes to give you some reasons why, grpc-java is instrumented with OpenCensus (soon will have Go and the rest of the languages) and we use the binary format there because it is lower overhead (sending multiple grpc metadata is expensive) also sending binary vs hex is faster.

Long term we want all gRPC users to use the binary format but we are aware of the legacy instrumentation using b3 used by OpenZipkin and will try to work with them to switch to the new format as well.

codefromthecrypt commented 6 years ago

I would set expectations to make the grpc binary format possible vs expecting a switch. Switch is very hard in brown field especially as libraries that do B3 with grpc are not in openzipkin's org and we dont have strict control of people's site deployments. It is more complicated even to attempt to switch.

I think carrot will be that whatever binary format.. that this buys more than just serialization overhead as the cost to push people towards anything they arent currently focused on is quite high.

For example if the binary format is somehow related to trace context or something such that would be a decent sell as the folks who are less likely to move may be on account of not being quite concerned about the cost to marshal hex which to many is negligible as it is far under a microsecond and http header compression should take care of the repeated B3 keys.

Basically I would tie this leverage towards binary format into something not singularly focused on grpc rather a bigger win that helps with other frameworks too.

My 2p.

On Sat, 7 Apr 2018 06:22 Bogdan Drutu, notifications@github.com wrote:

@terinjokes https://github.com/terinjokes to give you some reasons why, grpc-java is instrumented with OpenCensus (soon will have Go and the rest of the languages) and we use the binary format there because it is lower overhead (sending multiple grpc metadata is expensive) also sending binary vs hex is faster.

Long term we want all gRPC users to use the binary format but we are aware of the legacy instrumentation using b3 used by OpenZipkin and will try to work with them to switch to the new format as well.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/census-instrumentation/opencensus-go/issues/666#issuecomment-379400330, or mute the thread https://github.com/notifications/unsubscribe-auth/AAD61xSZlkgDXYAsxYYv7he1KI4LtE7qks5tl-qjgaJpZM4TFfds .

codefromthecrypt commented 6 years ago

By the way, in Brave (java library), I am happy to help investigate a framework-specific (grpc in this case) override towards the current version of grpc format. That way using the format internal to grpc doesnt bleed into other systems like http.

Current practice in just about all libraries is to be consistent with one format vs having different ones per platform. It is more complicated but still willing to help. At the very least we could read incoming grpc specific trace format, maybe stashing that info for outgoing or dual propagating or something.

rakyll commented 6 years ago

For example if the binary format is somehow related to trace context or something such that would be a decent sell as the folks who are less likely to move may be on account of not being quite concerned about the cost to marshal hex which to many is negligible as it is far under a microsecond and http header compression should take care of the repeated B3 keys.

I asked @bogdandrutu about this a year ago and he should be the one to figure it out.

That way using the format internal to grpc doesnt bleed into other systems like http.

The gRPC format is not leaking into other systems because we immediately convert the header to a SpanContext and propagate it in a context.Context object inside the project. Are you talking about something else?

codefromthecrypt commented 6 years ago

That way using the format internal to grpc doesnt bleed into other systems like http.

The gRPC format is not leaking into other systems because we immediately convert the header to a SpanContext and propagate it in a context.Context object inside the project. Are you talking about something else?

Saying if this is special-cased for grpc, we should be as clear as possible. Early experiments seem "fine" for interop with the binary format, just it is a lot easier to contain if pinned to grpc vs all transports, especially with in-flight work on trace context possibly affecting us later.

Here's an example experiment where I only read/write grpc binary format when .. well in grpc https://github.com/openzipkin/brave/pull/688

terinjokes commented 6 years ago

@rakyll I think sending tracing metadata in both formats is a reasonable alternative.

I'll likely need to coordinate with the middleware (in this case, Envoy) to ensure when it creates child spans, it also updates the binary format.

rakyll commented 6 years ago

Maybe @bogdandrutu should update Envoy to use the binary format if it is the format gRPC blesses.

semistrict commented 6 years ago

I need this for Istio trace support. I will probably start working on it in the next day or so.

vaijab commented 6 years ago

I need this for Istio trace support.

I had an impression that Istio will have OpenCensus native support?

semistrict commented 6 years ago

For now, it uses B3 as the propagation format. I might try to support other propagation formats as a follow up but it requires changes in a number of different Istio components so want to just go with B3 for now.

rakyll commented 6 years ago

Duplicate of https://github.com/census-instrumentation/opencensus-specs/pull/136.

semistrict commented 6 years ago

We can use this to track implementation once the spec issue is accepted.

codefromthecrypt commented 6 years ago

so B3 gets the number 666 which while lucky in chinese is somewhat less so in other cultures ;)

fho commented 5 years ago

We also require support for B3 propagation especially on the GRPC server-side. We have GRPC-Clients in different languages (e.g. python, php, nodejs) which propagate the tracing information in B3 format to the GRPC-Servers. We can not change all applications at once to use the binary format for GRPC propagation. Support for B3 format would allow us to migrate to opencensus in the Go-GRPC-Servers and then migrate step by step the other apps to use opencensus.

jcxplorer commented 5 years ago

I've made a small proof of concept for this and tested it with Istio. It's a small change, so if there's interest in merging it, I'd be happy to clean it up, add tests, and make it optional in ocgrpc.ClientHandler. I don't think we need support in ServerHandler, because we still pass the current grpc-trace-bin. At least for distributed tracing with Istio, this is enough.

Any of the current maintainers willing to accept a PR?

tmc commented 4 years ago

Is there an opentelemetry equivalent for this issue?