open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0
1.89k stars 825 forks source link

Thread.startVirtualThread does not propogate context #11950

Open anuragagarwal561994 opened 1 month ago

anuragagarwal561994 commented 1 month ago

Describe the bug

When a virtual thread is started context should propogate from the thread starting the virtual thread, but this is not happening instead a new context is started.

Steps to reproduce

Thread.startVirtualThread(() -> log.info("span")) this will initiate a new context and not log the respective trace id originated from the parent context

The same is also application for methods such as:

Thread.ofPlatform().start(() -> log.info("span"))
Thread.ofVirtual().start(() -> log.info("span"))

Expected behavior

Context propagation should happen and the virtual thread should have the same context as the thread initiating the start.

Actual behavior

Context propagation does not take place

Javaagent or library instrumentation version

2.5.0

Environment

JDK: Temurin-21+35 (build 21+35-LTS) OS: MacOs

Additional context

No response

heyams commented 4 weeks ago

Please check out this doc https://opentelemetry.io/docs/languages/java/instrumentation/#context-propagation-between-threads. Hope that helps.

anuragagarwal561994 commented 4 weeks ago

No this I know, just wanted to understand why the executors instrumentation doesn't handle this case.

trask commented 3 weeks ago

Related to #9534

@JonasKunz wdyt?

JonasKunz commented 3 weeks ago

Highly related to https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/2527, because those methods simply delegate to new Thread(..).start().

Contrary to my previous comment in that issue I think that explicitly spawning threads from within tasks is an anti pattern in comparison to using Executors.newVirtualThreadPerTaskExecutor, because it is easy to leak the thread and forget to join it. The executor interface was specifically extended to be AutoCloseable to prevent this issue, all started threads will be joined:

try(var executor = Executors.newVirtualThreadPerTaskExecutor()) {
   Future<?> result1 = executor.submit(...);
   Future<?> result2 = executor.submit(...);
} //waits for all submitted tasks to terminate

This pattern will be further enhanced with structured concurrency ensuring that no asynchronous processing is leaked outside of the starting method (or span in our context).

So if users follow those patterns, then new Thread(...).start() is primarily used for spawning continously running background tasks, for which I don't think that we'd want context propagation to happen.

Of course, there will be code bases not following this pattern, just like there are codebases explicitly starting and joining platform threads within tasks. For that reason it could make sense to provide an instrumentation for Thread.start but have it opt-in and configurable like I suggested in this comment.

anuragagarwal561994 commented 3 weeks ago

That is a very good explanation to understand the issue, the thing is in some cases you would want to do things in background in virtual threads, but thats short lived thing. So for example I have an entity in my request which I would want to cache asyncronously in my distributed cache so I can just do Thread.startVirtualThread and won't join it as with timeouts and everything configured in the actual sync call inside of the runnable I can assume that it would get complete. In such cases may be it would be beneficial for us to have context propogation

JonasKunz commented 2 weeks ago

I agree that there are cases / preferences where virtual threads might be directly started. That's why I think it could be useful to have the instrumentation, but have it opt-in. It might also be difficult to get that instrumentation to not interfer with the existing executor instrumentations, so it could be more work to implement than it initially seems.