Azure / azure-sdk-for-go

This repository is for active development of the Azure SDK for Go. For consumers of the SDK we recommend visiting our public developer docs at:
https://docs.microsoft.com/azure/developer/go/

[service-bus] Tracing #15678

Closed richardpark-msft closed 8 months ago

richardpark-msft commented 3 years ago

This issue tracks adding in distributed tracing using our azcore/tracing package, which has an adapter so it can report to OpenTelemetry.

Main issue: https://github.com/Azure/azure-sdk-for-go/issues/19280

There are some features missing to do linking and to generate diagnostic IDs, which are detailed in this comment: https://github.com/Azure/azure-sdk-for-go/issues/19280#issuecomment-1664902025

My plan is basically:

The actual spans we'll create are detailed in this spec from @lmolkova: https://gist.github.com/lmolkova/e4215c0f44a49ef824983382762e6b92

(previous text deleted now that we know we're going with our azcore/tracing package and we have a solid spec for what we're doing with regards to spans and their data)

devigned commented 3 years ago

https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/azuremonitorexporter is what we are using in Cluster API Provider Azure. You can see it used in comparison with Jaeger and App Insights here. There is some work to be done for metrics and log events, but it's not too far off. The comparison also shows traces including Azure SDK spans for HTTPS requests to Azure.

As for using tab, it might be better for the SDK to have its own abstraction: one less dependency. It's relatively simple to implement a single provider. It would be more complex to provide a pluggable solution where users can bring their own. However, by standardizing on OpenTelemetry and exposing traces in that format, a user can use OpenTelemetry exporters to export in their format of choice.

lmolkova commented 3 years ago

Azure Monitor plans in Go do not include an in-process exporter, but they are going to have an official collector-based exporter (out-of-process agent) in the long term. As @devigned just commented, there is an unofficial one.

I'm curious what API we'd want to expose to users for tracing. They'd have OpenTelemetry (or something else) either way if they want to trace, and in other languages we target zero to a few lines of configuration code to enable Azure SDK tracing.

In the happy case, everything users need to interact with lives in OpenTelemetry/Tab(Census/Tracing).

devigned commented 3 years ago

but they are going to have an official collector-based exporter (out-of-process agent) in the long term.

@lmolkova, can you expand on why we wouldn't continue to build upon the "unofficial" exporter? Seems to work pretty well thus far. Also, how long is long term? Are we talking days, months, years?

lmolkova commented 3 years ago

I don't really know whether it will be built on top of the unofficial one or rebuilt from scratch for some reason. There is no timeline, so the unofficial exporter is as good as it gets in the near future. Here's the source: https://techcommunity.microsoft.com/t5/azure-monitor/opentelemetry-azure-monitor/ba-p/2737823

richardpark-msft commented 2 years ago

Just to refresh this discussion a bit, we'll probably remove our direct dependency on 'tab' and just create a light pluggable abstraction, similar to what we provide in github.com/Azure/azure-sdk-for-go/sdk/azcore/log.
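To make the idea concrete, here is a minimal sketch of what a light pluggable tracing abstraction could look like, loosely following the single-registration style of azcore/log. Every name in it (`Provider`, `Span`, `SetProvider`, `StartSpan`, the `recorder` type, and the span and attribute names) is hypothetical and exists only for illustration. It is not the azcore/tracing API.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Span is the minimal surface SDK code would need from a tracing backend.
// (Hypothetical interface, for illustration only.)
type Span interface {
	SetAttribute(key string, value any)
	End(err error)
}

// Provider starts spans. A user would plug in one that forwards to
// OpenTelemetry or any other backend. (Hypothetical interface.)
type Provider interface {
	Start(ctx context.Context, name string) (context.Context, Span)
}

// noopProvider is the default, so SDK code never has to nil-check.
type noopProvider struct{}
type noopSpan struct{}

func (noopProvider) Start(ctx context.Context, _ string) (context.Context, Span) {
	return ctx, noopSpan{}
}
func (noopSpan) SetAttribute(string, any) {}
func (noopSpan) End(error)                {}

var (
	mu      sync.RWMutex
	current Provider = noopProvider{}
)

// SetProvider installs p process-wide; passing nil restores the no-op default.
func SetProvider(p Provider) {
	mu.Lock()
	defer mu.Unlock()
	if p == nil {
		p = noopProvider{}
	}
	current = p
}

// StartSpan is what SDK code would call around an operation.
func StartSpan(ctx context.Context, name string) (context.Context, Span) {
	mu.RLock()
	p := current
	mu.RUnlock()
	return p.Start(ctx, name)
}

// recorder is a toy Provider that remembers the names of ended spans.
type recorder struct {
	mu    sync.Mutex
	ended []string
}

type recordedSpan struct {
	r    *recorder
	name string
}

func (r *recorder) Start(ctx context.Context, name string) (context.Context, Span) {
	return ctx, &recordedSpan{r: r, name: name}
}
func (s *recordedSpan) SetAttribute(string, any) {}
func (s *recordedSpan) End(error) {
	s.r.mu.Lock()
	defer s.r.mu.Unlock()
	s.r.ended = append(s.r.ended, s.name)
}

func main() {
	r := &recorder{}
	SetProvider(r)

	// Span and attribute names here are placeholders, not taken from the spec.
	_, span := StartSpan(context.Background(), "ServiceBus.send")
	span.SetAttribute("messaging.system", "servicebus")
	span.End(nil)

	fmt.Println("ended spans:", r.ended)
}
```

The no-op default keeps the SDK free of any tracing dependency until a user opts in, which is the "one less dependency" point raised earlier in the thread.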

JeffreyRichter commented 2 years ago

Whatever happened to the tracing design we discussed? Did this ever get prioritized and assigned to anyone?

richardpark-msft commented 2 years ago

Nothing's moved forward here, beyond the original sketch that you shared with @lmolkova and me.

darrenparkinson commented 2 years ago

With regard to tracing, what is currently the appropriate way to achieve propagation between services using OpenTelemetry and Service Bus? The Service Bus docs say:

Microsoft Azure Service Bus messaging has defined payload properties that producers and consumers should use to pass such trace context. The protocol is based on the W3C Trace-Context.

I'm assuming this only works with .NET at the moment?

Thanks for any information you can provide.

richardpark-msft commented 2 years ago

@lmolkova, can you answer this question? This particular issue is about Go and what we're considering but, assuming we fall in line with the rest of the SDKs, will this solve @darrenparkinson's requirement as well?

lmolkova commented 2 years ago

@darrenparkinson As @richardpark-msft mentioned, the ServiceBus Go SDK does not yet support OpenTelemetry, but we're working on it.

In the meantime, the best option would be to write your own instrumentation:

  1. Follow the OTel messaging semantic conventions to create spans with specific attributes.
  2. Propagate context using Message.ApplicationProperties. Here's an example of how it's done in HTTP instrumentation.
  3. If you have a ServiceBus producer and consumer both written in Go and you're going to instrument them both, you can use the W3C Trace-Context propagator from OTel.
    • If one of them is written in another language (.NET, Java, Python, JS) where you leverage the existing SDK instrumentation, you'd have to implement your own propagator for Diagnostic-Id. You can use W3C Trace-Context as a starting point and rename traceparent to Diagnostic-Id. tracestate is not supported, so you can remove any mentions of it.
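The propagation steps above can be sketched with the standard library alone. This is a minimal illustration, not the OTel propagator: it handles only version `00` of the W3C `traceparent` format, ignores `tracestate`, and uses a plain `map[string]any` as a stand-in for `Message.ApplicationProperties` (assumed here to be a map with string keys). The `SpanContext` type and the `Inject`, `Extract`, and `newHexID` helpers are hypothetical names introduced for this sketch.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"regexp"
	"strings"
)

// SpanContext holds the fields carried in a W3C traceparent value.
type SpanContext struct {
	TraceID string // 32 lowercase hex characters
	SpanID  string // 16 lowercase hex characters
	Sampled bool
}

// Property names: "traceparent" for Go-to-Go propagation, "Diagnostic-Id"
// for interop with the other language SDKs, as described above.
const (
	W3CKey          = "traceparent"
	DiagnosticIDKey = "Diagnostic-Id"
)

var traceparentRE = regexp.MustCompile(`^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$`)

// Inject writes sc into props under key in version-00 traceparent format.
func Inject(props map[string]any, key string, sc SpanContext) {
	flags := "00"
	if sc.Sampled {
		flags = "01"
	}
	props[key] = fmt.Sprintf("00-%s-%s-%s", sc.TraceID, sc.SpanID, flags)
}

// Extract reads a span context back out of props. ok is false when the
// property is missing, is not a string, or is not a valid traceparent value.
func Extract(props map[string]any, key string) (sc SpanContext, ok bool) {
	s, isString := props[key].(string)
	if !isString {
		return SpanContext{}, false
	}
	m := traceparentRE.FindStringSubmatch(s)
	if m == nil {
		return SpanContext{}, false
	}
	// All-zero trace and span IDs are invalid per the W3C spec.
	if m[1] == strings.Repeat("0", 32) || m[2] == strings.Repeat("0", 16) {
		return SpanContext{}, false
	}
	flags, _ := hex.DecodeString(m[3]) // cannot fail: the regexp matched two hex digits
	return SpanContext{TraceID: m[1], SpanID: m[2], Sampled: flags[0]&0x01 == 1}, true
}

// newHexID returns n random bytes encoded as lowercase hex.
func newHexID(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

func main() {
	// Producer side: attach the current span's context before sending.
	props := map[string]any{} // stand-in for Message.ApplicationProperties
	Inject(props, DiagnosticIDKey, SpanContext{
		TraceID: newHexID(16),
		SpanID:  newHexID(8),
		Sampled: true,
	})

	// Consumer side: recover the context from the received message.
	if sc, ok := Extract(props, DiagnosticIDKey); ok {
		fmt.Println("continuing trace", sc.TraceID, "from parent span", sc.SpanID)
	}
}
```

When both sides are Go and already use OpenTelemetry, the OTel W3C Trace-Context propagator with a carrier backed by the application properties would replace the hand-written `Inject` and `Extract`. The hand-rolled version is mainly useful for the `Diagnostic-Id` interop case.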

richardpark-msft commented 1 year ago

Reviving the work here now that we have a tracing implementation in azcore in beta. Can't ship with it in that state quite yet, but I'll start prototyping so we make sure everything we need is in there.

This is the current spec for spans: https://gist.github.com/lmolkova/e4215c0f44a49ef824983382762e6b92

richardpark-msft commented 1 year ago

Some questions we want to make sure we can answer from this:

github-actions[bot] commented 8 months ago

Hi @richardpark-msft, we deeply appreciate your input into this project. Regrettably, this issue has remained inactive for over 2 years, leading us to the decision to close it. We've implemented this policy to maintain the relevance of our issue queue and facilitate easier navigation for new contributors. If you still believe this topic requires attention, please feel free to create a new issue, referencing this one. Thank you for your understanding and ongoing support.