Micrometer tracing - Githubissues

adamalexandru4 commented 9 months ago

[X] I am running the latest version
[X] I checked the documentation and found no answer
[X] I checked to make sure that this issue has not already been filed

Expected Behavior

Micrometer tracing available on each task execution.

Would be great if the ExecutePicked would be wrapped in a tracing context: One traceId, and different spans for complete or failure or even the same spanId.

Current Behavior

No tracing available

adamalexandru4 commented 8 months ago

@kagkarlsson If you agree with this idea, I can contribute the required changes.

kagkarlsson commented 8 months ago

I think that makes sense yes, as long as the code is acceptable. Not sure if we have all we need for it, I was thinking it could be implemented with what is now called StatsRegistry (might be promoted to some sort of Listener interface)

adamalexandru4 commented 4 months ago

The starting idea would be to play around with Tracer interface from micrometer-tracing inside the methods of ExecutePicked class. By default the Scheduler, would have a Tracer.NOOP implementation but in Spring Boot module we could try to detect the bean from ApplicationContext and inject it.

private void executePickedExecution(Execution execution, CurrentlyExecuting currentlyExecuting) { 
    .....
    Span taskExecutionSpan = tracer.nextSpan().name("task-execution")
      .tag("task-name", execution.taskInstance.getTaskName());
    try (Tracer.SpanInScope spanInScope = tracer.withSpan(taskExecutionSpan)) {
      ....
    } catch (..) {
      .....
   }  finally {
      taskExecutionSpan.end();
    }
 }

private void complete(CompletionHandler completion, Execution execution, Instant executionStarted) {
    ....
    Span taskCompleteSpan = tracer.nextSpan().name("task-complete")
      .tag("task-name", execution.taskInstance.getTaskName());
    try (Tracer.SpanInScope spanInScope = tracer.withSpan(taskCompleteSpan)){
        .... 
    } catch (..) {
      .....
   }  finally {
      taskCompleteSpan.end();
    }

private void failure(
      Task task,
      Execution execution,
      Throwable cause,
      Instant executionStarted,
      String errorMessagePrefix) {
    ....
    Span taskFailureSpan = tracer.nextSpan().name("task-failure")
      .tag("task-name", execution.taskInstance.getTaskName());
    try (Tracer.SpanInScope spanInScope = tracer.withSpan(taskCompleteSpan)){
        .... 
    } catch (..) {
      .....
   }  finally {
      taskFailureSpan.end();
    }

NicklasWallgren commented 3 months ago

We would also benefit from this, maybe OpenTelemetry could be a viable option instead of Micrometer Tracing?

I don't think it's possible to implement tracing using the StatsRegistry in its current form. It would require additional events in ExecutionStatsEvent (i.e. STARTED), and passing Execution to StatsRegistry#register so that we can add span attributes such as taskName and id.

kagkarlsson commented 3 months ago

A loose plan is to deprecate StatsRegistry and replace it with one or many Listener interfaces, also covering started and completed events. If anyone would like to take a stab at that (adding some form of adapter for StatsRegistry), feel free to try

NicklasWallgren commented 3 months ago

I've created a basic proof of concept that divides ExecutePicked into two versions: DefaultExecutePicked and ObservableExecutePicked. You can review the changes in the commit linked below: https://github.com/NicklasWallgren/db-scheduler/commit/5e2d23dd71a0c3043ab59a2da1b282579a05f1cf

However, a more efficient approach might be to transform StatsRegistry into a comprehensive listener interface, which would prevent multiple implementations of ExecutePicked

kagkarlsson commented 2 months ago

How does #504 work for this? Specifically:

  void onExecutionStart(CurrentlyExecuting currentlyExecuting);
  void onExecutionComplete(ExecutionComplete executionComplete);

kagkarlsson commented 2 months ago

@NicklasWallgren Any chance you could try reimplementing your PoC using the new SchedulerListener?

NicklasWallgren commented 2 months ago

@NicklasWallgren Any chance you could try reimplementing your PoC using the new SchedulerListener?

Great! It should possible to implement tracing with help of the new SchedulerListener.

I've implemented a temporary ExecutionHandler-wrapper, which bootstraps the trace/transaction before kicking of the delegated/actual execution handler.

public class ElasticApmExecutionHandler<T extends ObservableTaskContext> implements ExecutionHandler<T> {
    private static final String TRANSACTION_TYPE = "queue.task";

    private final ExecutionHandler<T> executionHandler;

    private ElasticApmExecutionHandler(final ExecutionHandler<T> executionHandler) {
        this.executionHandler = executionHandler;
    }

    @Override
    public CompletionHandler<T> execute(final TaskInstance<T> taskInstance, final ExecutionContext executionContext) {
        final Execution execution = executionContext.getExecution();

        final T data = taskInstance.getData();

        final Transaction transaction = ElasticApm
            .startTransactionWithRemoteParent((it) -> TraceIdHeaderValueFactory.create(data.traceId(), data.spanId()));

        try (final Scope scope = transaction.activate()) {
            transaction.setName("QueueTask#" + execution.getTaskName());
            transaction.setType(TRANSACTION_TYPE);

            transaction.setLabel("queue.task.id", execution.getId());
            transaction.setLabel("queue.task.name", execution.getTaskName());

            return executionHandler.execute(taskInstance, executionContext);
        } catch (final Exception e) {
            transaction.captureException(e);

            throw e;
        } finally {
            transaction.end();
        }
    }

    public static <T extends ObservableTaskContext> ElasticApmExecutionHandler<T> wrap(final ExecutionHandler<T> executionHandler) {
        return new ElasticApmExecutionHandler<>(executionHandler);
    }

    private static class TraceIdHeaderValueFactory {
        /**
         * Builds a trace parent which follows the W3C TraceParent standard.
         * <p>
         * Format: trace-id "-" span-id "-" trace-flags
         *
         * @return the trace parent
         */
        public static String create(final String traceId, final String spanId) {
            return String.format("00-%s-%s-01", traceId, spanId);
        }
    }
}

kagkarlsson commented 2 months ago

After some consideration I added an interceptor-concept as well. See PR #504

kagkarlsson / db-scheduler

Micrometer tracing #451

Expected Behavior

Current Behavior