open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0
1.95k stars 855 forks source link

Support context propagation in Cats Effect library (Scala) #10599

Open gaeljw opened 8 months ago

gaeljw commented 8 months ago

Is your feature request related to a problem? Please describe.

When trying out auto-instrumentation in a Scala application (Play Framework), I noticed that some of my Play WS Client calls are instrumented but not related to their parent traces. As some others were, I dug a bit and found out it's related to my usage of Cats library (Cats Effect and Cats Retry).

I think the context is not propagated when using Cats Effect library.

Here's a minimal code sample to highlight the behavior I'm talking about:

import cats.effect.IO
import play.api.libs.ws.WSClient
import retry._ // Cats Retry

val wsClient: WSClient = ???

// If called directly (from the Play controller layer for instance), context is propagated, I see it as a span in the parent trace
def callSimple(): Future[X] = {
  wsClient.url("http://some-ws").get().map(response => X(???))
}

// When this one is called, context is not propagated, I see the callSimple() as a standalone trace, link with parent trace is lost
def callWithCatsRetry(): Future[X] = {
  val ioTask: IO[X] = IO.fromFuture(IO(callSimple()))

  val retryPolicy = RetryPolicies.limitRetries[IO](3).combine(RetryPolicies.constantDelay(100.millis))
  def onError(err: Throwable, detail: RetryDetails): IO[Unit] = ???

  val errorIOWithRetry: IO[X] = retryingOnAllErrors[A](retryPolicy, onError)(ioTask)

  errorIOWithRetry.unsafeToFuture()
}

Note that it also impacts further operations. Let's say I chain a second call without retries: the context is lost event though I don't use Cats in this second call.

// With this, both calls appear as a span in the parent trace
callSimple().flatMap { response1 => 
  callSimple().map { response2 =>
    // Do something with response1 and response2
  }
}

// With this, both calls appear as a standalone trace, even if the second callSimple does not use Cats Effect library
callWithCatsRetry().flatMap { response1 => 
  callSimple().map { response2 =>
    // Do something with response1 and response2
  }
}

Describe the solution you'd like

I'd like the context to be propagated in Cats Effect IO usage (and thus in Cats Retry) so that, in my example above, calling callWithCatsRetry gives me a span attached to the parent trace rather than a new trace.

Describe alternatives you've considered

I'll be looking at not using Cats Effect for this, likely by using a simpler retry library without IO type.

Additional context

For people not familiar with Scala ecosystem, Cats Effect is kinda similar to ZIO (for which there's some auto instrumentation available): it's a "asynchronous runtime" working with fibers (green threads).

gaeljw commented 8 months ago

I've seen there's a library otel4s (see also) which may be could serve as a guide to implement the context propagation :shrug: (I'm not very familiar with Cats Effect, I'm not sure if this even make sense TBH).

gaeljw commented 8 months ago

As pointed out to me, this may be implemented in a similar way as it was for ZIO in https://github.com/open-telemetry/opentelemetry-java-instrumentation/pull/7980;

laurit commented 8 months ago

We'd welcome a contribution for this

gaeljw commented 8 months ago

From @djspiewak (Cats Effect maintainer) on Discord:

(...) If you want to dig into this problem more deeply, I recommend cracking open IOFiber inside of Cats Effect core and going from there. This is the core of the fiber execution machinery, and it has direct awareness of when a fiber is running and on which thread. I believe there was a similar effort a couple years ago to get the DataDog agent working with CE.

Another (probably less invasive) option would be to see if you can do anything with IOLocal. There's an open PR (which will be in 3.6.0) which makes IOLocal usable from impure contexts, which seems close to what you would need to do something like this without hooking into Cats Effect's guts directly. https://github.com/typelevel/cats-effect/pull/3636