DataDog / dd-sdk-ios

Datadog SDK for iOS - Swift and Objective-C.
Apache License 2.0

[Tracing] Swift Distributed Tracing support (or better async/await support) #1511

Open 0xpablo opened 11 months ago

0xpablo commented 11 months ago

I was looking to integrate DataDog's tracing today and noticed that there is a concept of a current execution context. I believe it relies on the thread-based os_activity APIs, which don't work with code that uses async/await.

It would be great to either support Swift Distributed Tracing or to at least have better async/await support using Task Locals.
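
To illustrate why Task Locals fit here, below is a minimal, SDK-independent sketch. The names `TraceContext`, `activeSpanName`, `currentSpanName` and `demo` are hypothetical and exist only for this example. It shows that a task-local value bound with `withValue` is visible in the current task and in child tasks created with `async let` or a task group, and is unbound again once the scope ends.

```swift
enum TraceContext {
    // Hypothetical task-local holding the name of the active span.
    @TaskLocal
    static var activeSpanName: String?
}

// Reads the task-local value, falling back to "none" when nothing is bound.
func currentSpanName() -> String {
    TraceContext.activeSpanName ?? "none"
}

func demo() async -> [String] {
    let before = currentSpanName()

    let inside: [String] = await TraceContext.$activeSpanName.withValue("parent") {
        // Same task: sees the bound value.
        let direct = currentSpanName()

        // Child task created by `async let`: inherits the task-local value.
        async let child = currentSpanName()

        // Child task created by a task group: inherits it as well.
        let grouped: String = await withTaskGroup(of: String.self) { group in
            group.addTask { currentSpanName() }
            return await group.next() ?? "none"
        }

        return [direct, await child, grouped]
    }

    // Outside the `withValue` scope the binding is gone.
    let after = currentSpanName()
    return [before] + inside + [after]
}
```

Because the value travels with the task rather than with a thread, it survives suspension points where the task may resume on a different thread.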

0xpablo commented 11 months ago

I've been experimenting a bit and defined a couple of helpers that seem to do the trick. I'll share them in case it's useful to someone:

Helpers:

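import DatadogTrace  // dd-sdk-ios 2.x module providing Tracer, OTSpan and OTSpanContext

enum SpanContext {
    // Task-local storage: the value follows the Swift task (and its child tasks), not the thread.
    @TaskLocal
    static var current: OTSpanContext?
}

// Synchronous variant: runs `operation` inside a new span whose parent is `context`
// or, if none is given, the task-local current span.
public func withSpan<T>(
    _ operationName: String,
    context: OTSpanContext? = nil,
    _ operation: (OTSpan) throws -> T
) rethrows -> T {
    let span = Tracer.shared().startSpan(
        operationName: operationName,
        childOf: context ?? SpanContext.current
    )

    // Runs on every exit path, after the catch block below has recorded any error.
    defer { span.finish() }

    do {
        // Make this span the current context for the duration of `operation`.
        return try SpanContext.$current.withValue(span.context) {
            try operation(span)
        }
    } catch {
        span.setError(error)
        throw error
    }
}

// Async variant: same behavior, for operations that suspend.
public func withSpan<T>(
    _ operationName: String,
    context: OTSpanContext? = nil,
    @_inheritActorContext @_implicitSelfCapture _ operation: (OTSpan) async throws -> T
) async rethrows -> T {
    let span = Tracer.shared().startSpan(
        operationName: operationName,
        childOf: context ?? SpanContext.current
    )

    defer { span.finish() }

    do {
        return try await SpanContext.$current.withValue(span.context) {
            try await operation(span)
        }
    } catch {
        span.setError(error)
        throw error
    }
}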

Example:

try await withSpan("Parent") { span in
    try await Task.sleep(nanoseconds: NSEC_PER_SEC * 1)

    try await withSpan("Child 1 (Group)") { _ in
        try await withThrowingTaskGroup(of: Void.self) { taskGroup in
            taskGroup.addTask {
                try await withSpan("Group Child 1") { _ in
                    try await Task.sleep(nanoseconds: NSEC_PER_SEC * 1)
                }
            }

            taskGroup.addTask {
                try await withSpan("Group Child 2") { _ in
                    try await Task.sleep(nanoseconds: NSEC_PER_SEC * 1)
                }
            }

            try await taskGroup.waitForAll()
        }
    }

    try await withSpan("Child 2") { span in
        try await Task.sleep(nanoseconds: NSEC_PER_SEC * 2)
    }
}

Produces:

[image]

ganeshnj commented 11 months ago

Thanks for the suggestion and the example, @0xpablo. Very helpful.

We have already been discussing support for structured concurrency within the team.

Regarding adopting or providing support for https://github.com/apple/swift-distributed-tracing: we usually try to stay close to OTel (and OpenTracing) in terms of experience. I know it is not a 1:1 mapping, but that's the aspiration.

I wonder: if OTSpan were Sendable, would that make it any better?

0xpablo commented 11 months ago

That's great to hear! I think it would make sense to make OTSpan Sendable so that it can be passed between concurrency domains. This should be easy, since DDSpan can be marked as @unchecked Sendable because it's already thread-safe, right?

Thanks!

ganeshnj commented 11 months ago

Mostly yes. DDSpan can be marked as @unchecked Sendable, but it has some properties, such as the tracer instance, that would need to be ignored.
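
As a rough illustration of the `@unchecked Sendable` approach discussed above, here is a minimal sketch. `DemoSpan` and `tagConcurrently` are hypothetical names for this example only; this is not the SDK's DDSpan, and it assumes thread safety is provided by a lock around all mutable state. `@unchecked` tells the compiler to skip its Sendable checks for the stored properties, so correctness rests entirely on that manual synchronization.

```swift
import Foundation

// Hypothetical stand-in for a span type; NOT the SDK's DDSpan.
final class DemoSpan: @unchecked Sendable {
    private let lock = NSLock()
    // Mutable state, only ever touched while holding `lock`.
    private var tags: [String: String] = [:]
    private var finished = false

    let operationName: String

    init(operationName: String) {
        self.operationName = operationName
    }

    func setTag(key: String, value: String) {
        lock.lock()
        defer { lock.unlock() }
        guard !finished else { return }  // ignore writes after finish
        tags[key] = value
    }

    func finish() {
        lock.lock()
        defer { lock.unlock() }
        finished = true
    }

    // Returns a copy of the tags taken under the lock.
    func snapshot() -> [String: String] {
        lock.lock()
        defer { lock.unlock() }
        return tags
    }
}

// Shares one span across many child tasks; this compiles only because DemoSpan is Sendable.
func tagConcurrently(_ span: DemoSpan, count: Int) async -> Int {
    await withTaskGroup(of: Void.self) { group in
        for i in 0..<count {
            group.addTask { span.setTag(key: "k\(i)", value: "\(i)") }
        }
        await group.waitForAll()
    }
    return span.snapshot().count
}
```

Any stored property that is not itself thread-safe would be hidden from the compiler by `@unchecked`, which is why such properties need a careful look before the conformance is added.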