Open tylerbenson opened 5 years ago
Thanks @tylerbenson. I agree the tag convention should be reverted.
I like that this won't accidentally result in a tag "service" which would be hard to grok whether it was an accident of legacy instrumentation or a new thing. It also addresses the first-class nature of service in distributed tracing. Finally, it deals with scoping properly.
One thing I would suggest is keeping this feature on the tracer as opposed to overloading the scope with more methods.
Ex
Span span = tracer.buildSpan("stuff").start();
try (Scope scope = tracer.activateForService(span, "who-I-proxy-for")) {
// newScope is now active scope and service name is overridden for any new children spans.
}
note the span itself here should be only assigned to one service, so this is not only the children but also the activated one itself. There might be some nuance to how this plays out.
Note: the above allows you to switch services multiple times: nested scopes. If that's a feature it should end up in the tests.
An alternative would be to not allow nesting and do something similar to baggage which is write-only.
For example, adding a serviceName
method to the span builder.
Ex
Span span = tracer.buildSpan("stuff").serviceName( "who-I-proxy-for").start();
try (Scope scope = tracer.activate(span)) {
// newScope is now active scope and service name is overridden for any new children spans.
}
The pro of this is that it allocates the span from the beginning as that service. Also, it can't accidentally end up ambiguously as a "service" tag in old instrumentation. OTOH, this has the same implementation gotchas as baggage (Except the remote propagation part), even if it is only scoping a single property (service). That said, some tracers may already have a .serviceName method on their span which lets this drop in.
food for thought.. regardless, this should be tested in practice against something that proxies, and directly linked.
Test cases should include at least:
+1 to have this integrated with span creation rather than scope.
But the description could use better justification. The only one I heard is old instrumentation already using "service" tag for different purposes. I think it's a good enough reason (I wasn't a fan of the tag in the first place), but are there other considerations?
@yurishkuro to be clear, you would prefer @adriancole's second suggestion, to see this associated with the span builder and have it propagated more like baggage?
I was only referring to the builder part. Auto-propagating sounds like assuming too much, but I can see the rationale for that. Perhaps it also needs to be explicit.
I don't think anyone is suggesting for it to propagate across processes, just referring to how baggage is shared between spans.
In looking at prior art (like finagle) and also offline chat with @codingfabian also thinking about this..
I think the simplest design is requiring the tracer to inherit parent-child any time it is set. This can't be assumed in tags, not safely IMHO as no other data in tags act like that. I think the span builder is the best way to achieve this in the OT api, possibly named "spanBuilder.serviceName"
For folks who are already using the "service" tag convention (ex @codingfabian or @jeffbakerexpedia), we can treat it like any other data which is sortof "old school". For example, when tracers upgrade, if they see a tag named "service" they can try to re-forward it to the correct api. This would work at builder time for example, even if be a bit dodgy later.
Aside: to underpin this, a span.serviceName() or similar method would be needed in brave to properly inherit. We were to do this anyway, just waiting for rule of three anyway. For example, we knew camel use the forwarding stuff and currently hack by using different tracers. It needed to be a repeating pattern before escalating to an api. I can manage adding this there.. cc @oscerd
I like the idea of service
being a more concrete concept in the spec.
One question on scope: the service
tag was added as an optional way to implement a limited set of behavior - out of band reporting on multiple services. But the "service scope" comes up more broadly in tracing than just this usecase:
How do others feel about these broader usecases? If we're going to add serviceName as an explicit concept to a span, I feel like we should consider the implications beyond just out-of-band reporting.
I am also +1 on doing this at span creation rather than scope. We have an internal use case at Expedia in which our build system is using Expedia's Haystack tracing system to visualize our CICD pipeline; we currently do this with the tag-based override but will change our code to support the new spec when it changes.
@tedsuo What should be our path forward here?
I agree with @adriancole to have serviceName as a first class citizen and not as a tag. Whether we call it as serviceName or applicationName to be generic, I prefer an API like what Adrian proposed above
Span span = tracer.buildSpan("stuff").serviceName( "who-I-proxy-for").start();
This forces serviceName to be a concrete member of the data model and not as a tag.
On a side note, I would say the same for remoteServiceName instead of or in addition to peer.service tag. Though this need not be mandatory, having it helps with building better service graphs. There is a related discussion in Haystack https://github.com/ExpediaDotCom/haystack/issues/537
Service name is insufficient. Jaeger uses Process
(instance, resource, workload are the other possible names), which contains a service name and a set of tags that contain additional metadata about the workload (host name, IP, port, zone, environment, code version, etc.).
It is different from remote service (peer.service
) because the current workload may not know much metadata about the peer service, and the tag itself is generally not that critical.
I think the simplest design is requiring the tracer to inherit parent-child any time it is set. This can't be assumed in tags, not safely IMHO as no other data in tags act like that. I think the span builder is the best way to achieve this in the OT api, possibly named "spanBuilder.serviceName"
+1 to inheritance & builder method (but replace "name" with a more complex object).
I submitted a couple PRs (see above) to try to revert this mess I made. I suggest we get these merged and proceed forward with a separate PR to add service name as a first class API.
@tylerbenson I would prefer to make changes backwards compatible, i.e. deprecate rather than remove.
@yurishkuro I don't think it needs to be deprecated in the java repo since it was added since the last release, so technically not a breaking change there. I would consider changing it to deprecated in the spec though with the assumption that means it probably shouldn't be added to language implementations if not already there.
ah, if it never made it to a public release, then sgtm
Background
A while ago, we added
service
as a tag (#119). This is overly simplistic and has high probability for user confusion.Problem
The problem is we scoped it as a tag specifically for a narrow use case, but as implemented, it is trivial for users to use it in ways that can fundamentally break common tracer logic.
Proposal
I propose we roll back/remove service as a tag, and move the ability to override the service name to the
Scope
.Perhaps something like this (in java parlance):
This also opens the possibility of additional
Scope
scoped tags or other things.Consideration
I don't believe any releases have been published with the new
service
tag included, so at a minimum, we should at least revert #119 and https://github.com/opentracing/opentracing-java/pull/287 before they are released to prevent the need for a longer tail deprecation.