Open keegancsmith opened 7 months ago
Does this change make sense? Should we keep in our application code the idea of if we should create a trace or not, or is that all magically handled somehow via grpc? Or for that do happen do I need to port our code to otel?
cc some more people who did tracing stuff @burmudar @bobheadxi
I think this makes sense:
Trace bool
is handled with custom propagation in sg/sg, but you should be able to access SpanContext - it's supposed to be serialized and propagated over the wire via OTEL propagators, which should apply for the gRPC middleware - and in TraceFlags
there is a standardized sampled
flag. I'm not sure it works OOTB - I think we would need to ensure Sourcegraph sets the trace state correctly in OTEL SDKs from the tracepolicy
stuff. My understanding is that the idea is in application code, you don't care, but in practice, if a parent indicates sampled=false
then all child spans will be created with that flag, then your exporter will just drop them. This is the behaviour I relied on in https://github.com/sourcegraph/sourcegraph/commit/4556b317f1f4494703d53383959be754b64d945d#diff-137bfca3e89a76caf2e8ae1694aacbb004e89389ff181f6291846480be15ed25R16-R25 , there's some description in the commit messageSpanContext
here is used for, but anything that should be propagated should be available in the parent span you retrieve from context if an OTEL middleware is set up on the service handlers to create the parent remote
span in contextYeah, this makes sense to me. Do we not use otel already? I was thinking I updated Zoekt at some point
Thanks for the comments!
I haven't forgotten about this PR, I'm just focussed on other projects at the moment. We do use OTEL, but not "natively". IE it all goes via opentelemetry at instrumentation time, but at configuration time we do it via OTEL and return a opentracing.Tracer from that.
We also still support using the jaeger clients via opentracing, which I believe is also deprecated and we should just use OTEL.
So I think I can take this PR further (or do a follow-up) and just rip out a bunch of stuff and make everything work with just OTEL.
Since the move to gRPC these fields should make no difference. SpanContext is never set anymore by our clients. Trace is still set, but I believe we should be communicating to trace or not via grpc.
Additionally we just rely on the opentracing global tracer. If tracing is not configured it will just use the noop tracer, not sure why we had our own implementation.
Test Plan: go test. Will seek more guidance on other tests to run.