census-instrumentation / opencensus-java

A stats collection and distributed tracing framework
https://opencensus.io
Apache License 2.0
672 stars 201 forks source link

Support auto-collection of HTTP calls #1508

Open dhaval24 opened 5 years ago

dhaval24 commented 5 years ago

Goals:

  1. Support automatic interception of inbound and outbound HTTP calls (Interprocess HTTP communication).
  2. Support popular HTTP Clients (namely Apache HTTP Client, OKHtttp Client, Native Java HTTP Client).

Possible Options:

Interception of Inbound calls: Design a ServeletFilter to intercept inbound calls, extract headers and create spans from it with required metadata from request parameters.

Challenges: ThreadLocal may not be used possibly to store context, as it isn't persisted across child threads that might be created in order to process the incoming requests. Brave solves this by passing the raw object manually. @adriancole to confirm this.

Patching outbound calls with Headers:

  1. Use a decorated client for popular HTTP clients. Example: https://github.com/openzipkin/brave/tree/master/instrumentation/httpclient Advantages: Easy to maintain and less fragile. Disadvantages: Requires the users to modify the HTTPClient creating in all the parts of their codebase.

  2. Hybrid approach: Use Byte Code Instrumentation to patch outbound calls using JVM Agent for popular HTTP Clients like ApacheHTTP and OKHttp and provide decorator based instrumentation for less popular clients. Advantages: Easy to use for majority of users with minimal code. Disadvantage: Fragile and depends on method signatures. Harder to test.

I would like to hear community opinion on this topic.

CC: @rghetia @adriancole @reyang @grlima

bputt commented 5 years ago

I would prefer # 1 for today with the eventual support of # 2. I imagine the "provide decorator based instrumentation for less popular clients" would essentially refer to # 1.

dhaval24 commented 5 years ago

@bputt yes, you are correct. "Provide decorator based support" is essentially # 1 for less popular clients. I would imagine something like Reactor (not less popular per say, but hard to instrument in agent IMO).

bputt commented 5 years ago

I'd like to suggest this supporting

  1. Automatic injection of X-B3 headers
  2. Automatic injection of a set of headers w/ values (For example, if we added an additional header to help our clients determine their sampling rate based on what the upstream system was)
  3. Ability to choose what attributes get added to the span (maybe we don't want the POST payload as an example)
  4. Ability to turn off the creation of spans, but will still inject headers
rghetia commented 5 years ago

I'd like to suggest this supporting

  1. Automatic injection of X-B3 headers
  2. Automatic injection of a set of headers w/ values (For example, if we added an additional header to help our clients determine their sampling rate based on what the upstream system was)
  3. Ability to choose what attributes get added to the span (maybe we don't want the POST payload as an example)
  4. Ability to turn off the creation of spans, but will still inject headers

1 - Default propagation should be W3C Trace context. However, user can choose to override with B3.

4 - can you elaborate on this?

bputt commented 5 years ago

For # 4...Let's just remove that suggestion as # 3 would suffice so that I'm limiting the amount of data within the span

dhaval24 commented 5 years ago

I'd like to suggest this supporting

  1. Automatic injection of X-B3 headers
  2. Automatic injection of a set of headers w/ values (For example, if we added an additional header to help our clients determine their sampling rate based on what the upstream system was)
  3. Ability to choose what attributes get added to the span (maybe we don't want the POST payload as an example)
  4. Ability to turn off the creation of spans, but will still inject headers

For # 1 @rghetia and @bputt I would think a different approach. Eventually, I think it would be a fine choice if we could have the interop between multiple headers. I think we should be able to solve a scenario of connected services, regardless of which header format they use (An Open Census Java service should be able to talk to Sleuth instrumented java service having B3 header and vice versa). I would suggest the following:

Have observer pattern to understands incoming header format and the delegate that could transform the incoming header to the new format

This would help in solving Sleuth service talking to Census service and show distributed transactions. It would help eliminate the force need of users to adopt to a specific protocol. In case of enterprise languages like Java and C# it is often very hard for users to migrate very often.

Let me know your thoughts. We can start a separate issue for this and keep this one focused on ideas around HTTP instrumentation though.

bputt commented 5 years ago

I like the concept and could see potential issues where the originating system is using opencensus and a downstream system might be jaeger/zipkin (whether it's code or using something like a service mesh in k8s (something other than istio)) and unless those support the W3C Trace Context headers, then you'd lose the trace...However, that might not be an issue that opencensus should tackle?

dhaval24 commented 5 years ago

@bputt in order for the wole tracing to work other libraries would also have to adopt a similar approach of interoping different headers, but in my perspective it would be great if OpenCensus instrumented libraries can handle scenarios like that (I think this would be responsibility of census), though I do believe census would not be responsible for downstream libraries to ensure same behavior. Thoughts?

bputt commented 5 years ago

I agree, opencensus should not be responsible for downstream libs to behave the same. I can see the ecosystem supporting a variety of headers (Jaeger supports X-B3 and I'm guessing will support W3C Trace Context if not already)

dhaval24 commented 5 years ago

@bputt agreed, but it would not be 100% the case that downstream libraries would support W3C (we can hope so). So just to clarify my previous comment:

OC -> Have ability to interop (with pluggable model for user to define header format, if there are some propritory ones in their system) various headers, and convert them to W3C / B3 or whatever decided for outbound calls.

So at very high level this is how it might look like:

Incoming request -> Header parsed -> Identify respective header format -> delegate to respective handler / observer for processing headers based on format -> send outbound headers with W3C or B3.

Does the example help?

rghetia commented 5 years ago

@dhaval24 , as you said earlier let's start a separate issue for this and focus on http instrumentation here.

bputt commented 5 years ago

@dhaval24 Your example makes sense and helps, thanks

dhaval24 commented 5 years ago

@bputt I will start a separete issue to focus on header transformation. For brevity lets keep the issue for HTTP instrumentation.

@rghetia here are my few questions and concerns regarding decorator based approach.

  1. It is feasible only when user has access to code. How about for example, redis client making a downstream call with Apache HTTP Client (which is of-course not decorated)? We would break tracing there.

  2. User would need to make code changes to update to new client in several parts of their code base. This would be exhausting.

rghetia commented 5 years ago

@dhaval24 providing instrumentation for client library is one approach. Using same example of redis, we have jedis instrumentation today to collect metrics and traces for redis operation. But I don't believe it propagates tracecontext downstream. It is only useful to propagate the context if the downstream (redis) is also instrumented.

dhaval24 commented 5 years ago

@rghetia yes you are correct, and that is because I believe we have instrumented Jedis client with decorator. However, if we instrument the client using agent (bytecode way) then the downstream calls will also propagate imo. So in short there are pros and cons. We should chose based on the community needs. What do you think ?