opentracing / opentracing-java

OpenTracing API for Java. 🛑 This library is DEPRECATED! https://github.com/opentracing/specification/issues/163
http://opentracing.io
Apache License 2.0
1.68k stars 344 forks source link

Feedbacks from Apache SkyWalking project #257

Closed wu-sheng closed 6 years ago

wu-sheng commented 6 years ago

Hi, everyone. I got some time to review our new 0.31.0 release, and try to update SkyWalking Java OpenTracing implementor, but find some questions and problems about in thread propagation and across thread propagation.

The questions are all about propagation, and Span take too much duties, both propagation and tag/log data. ScopeManager should be in charge of propagation.

In Thread Propagation

Span can be created by tracer#buildSpan, and then need scopeManager#activate. Then what will happen if a span created by not active, but someone used it? By that question, my major point is that, why design the API in that way. Why don't just use scopeManager#buildSpan to create one? Allowing a span stays in not-active-yet can trigger questions.

Cross Thread Propagation

I feel like the cross thread propagation APIs is not so explicit than 0.30.0 and also cause me some questions. This is our documents say:

io.opentracing.Tracer tracer = ...;

// STEP 1 ABOVE: start the Scope/Span
try (Scope scope = tracer.buildSpan("ServiceHandlerSpan").startActive(false)) {
    ...
    final Span span = scope.span();
    doAsyncWork(new Runnable() {
        @Override
        public void run() {

            // STEP 2 ABOVE: reactivate the Span in the callback, passing true to
            // startActive() if/when the Span must be finished.
            try (Scope scope = tracer.scopeManager().activate(span, false)) {
                ...
            }
        }
    })
  1. The Span should be used in multi thread or a single one? That is important for people who use OT-java library to do real instrument for some frameworks having multi threads models.
  2. If the answer of first question is YES. Then, what happens, when the Span created, active, tag and log in a thread, then span#tag or span#log are called in child/another thread but not active?
  3. Also if the answer of first question is YES, why need support a span can be manipulated in multi thread? Multi threads always mean 1) job-assign from one thread to another. One span for parent, one span for children is enough and more clear. 2) need process something concurrently, Then, a span can't make sure the duration is right, because the real cost is based on the number of threads, and single span can't tell.

Also based on that, in SkyWalking implementation, we don't support a single span being manipulated in multi thread. I hope we can agree on this. And then, when cross thread happens, I proposal to use ScopeManager#capture and ScopeManager#continued (just example names) APIs to notify the thread changes.

I proposal this not just because SkyWalking did that, more important reason is most Java tracers/APMs use ThreadLocal to propagate in thread, which it is efficiency and easy, so they all need be notified. By efficiency, I mean the tracer didn't need to consider race condition (sync or lock) for span's methods.

At least, we should be clear what we recommend to do in async scenario for library end user and tracer implementor.

cc @opentracing/otsc @opentracing/otiab

carlosalberto commented 6 years ago

Hey,

Sorry for the delayed feedback. Items below:

Span can be created by tracer#buildSpan, and then need scopeManager#activate. Then what will happen if a span created by not active, but someone used it? By that question, my major point is that, why design the API in that way. Why don't just use scopeManager#buildSpan to create one? Allowing a span stays in not-active-yet can trigger questions.

Sometimes a Span needs to be created by a middleware layer and cannot be activated for some reason (such as the middleware not being able to guarantee the thread in which the actual code will be run). For this reason we want to allow users to have Span without being active. The common_request_handler in opentracing-testbed of this repo shows such case.

I feel like the cross thread propagation APIs is not so explicit than 0.30.0 and also cause me some questions.

Sounds like we make to make the README clearer and augment it too.

The Span should be used in multi thread or a single one? That is important for people who use OT-java library to do real instrument for some frameworks having multi threads models.

Span, at least for Java, should be able to be passed between threads (making it thread-safe). This is mentioned in the README, and it sounds we also need to highlight that :)

what happens, when the Span created, active, tag and log in a thread, then span#tag or span#log are called in child/another thread but not active?

It should work just fine (the calls to tag or log, that is), as long as there's a reference to the actual Span.

why need support a span can be manipulated in multi thread?

One example scenario is a Span used in a set of callbacks in a thread-pool, which means that the same work may be achieved through more than one thread.

Got the feeling the README could be improved. I will look into that ;)

wu-sheng commented 6 years ago

Sorry, I miss your comments.

Span, at least for Java, should be able to be passed between threads (making it thread-safe). This is mentioned in the README, and it sounds we also need to highlight that :)

Span cross thread would be a great challenge for SkyWalking. I am aware some of users could say they would, but in Java, especially a real native Java application, every few people really do so from my experience. Since I have help many Chinese big companies to build their systems/services, I think it is , at least in China.

Besides the experiences, allowing span across threads also make context very hard to finished. I am aware many tracing systems are only care of span, but in SkyWalking, we are care of Segment( all spans in a serve). At the same time, as an auto instrument agent, we can't count on users' #finish.

So, I have to say, we could not support API running in that way, Cost too much to maintain context and span to fix a very few user case. SkyWalking only supports context across thread, not span.

wu-sheng commented 6 years ago

@carlosalberto I know my gitter, if you have further questions, welcome to ping me. I hope we can find some solutions for these.

carlosalberto commented 6 years ago

Hey @wu-sheng

allowing span across threads also make context very hard to finished

After talking with a few people from the Cross Language group, we came with the impression that Span instances should be thread-safe or guarantee correctness when running on concurrent systems/applications. An example of this would be an application using a thread pool or scheduler, where a single task execution can spread among different threads.

We will be working on the adding this to the spec soon (as a Draft initially, of course - in order to give others a chance to jump in and leave their feedback).

wu-sheng commented 6 years ago

@carlosalberto I am aware that scenario. In SkyWalking, we use two spans in the two threads. And an across thread ref to link the two segment.

I know many manual SDK could support that.

carlosalberto commented 6 years ago

This is now being discussed as part of https://github.com/opentracing/specification/pull/120 - we will close it here once we have reached an agreement there ;)

carlosalberto commented 6 years ago

Wondering if we should close this issue and keep the discussion in https://github.com/opentracing/specification/pull/120, as previously mentioned. @wu-sheng any problem with that? Let me know ;)

carlosalberto commented 6 years ago

Closing this one in favor of opentracing/specification#120 Let me know otherwise ;)