open-telemetry / opentelemetry-js

OpenTelemetry JavaScript Client
https://opentelemetry.io
Apache License 2.0
2.67k stars 768 forks source link

how to set the current active context and current active span without callbacks #3558

Open tareksalem opened 1 year ago

tareksalem commented 1 year ago

I started using open telemetry with nodejs, actually found less resources and concrete documentation for setting it up with nodejs, finally I managed to instrument my nodejs project but I have a question,

for now there are two main ways to set the current active context using the following

I want to achieve something like this

const mySpan = tracer.startSpan('TEST_ONE', {}, api.context.active());
api.context.setActiveContext(newContextHere)
const childSpan = tracer.startSpan("TEST_TWO", {}, api.context.active())

I think this should be doable because maybe I want to create a child context from the parent one and I need to access on it from another function, so why force doing that in a callback? also spans is an implementation details so it should not affect my code, but instead of that, I am forced to wrap my code inside callbacks for spans!!

The second question is

how to set the current active span?

the documentation also is lacking this!!

thank you

legendecas commented 1 year ago

You can start a child span without setting the parent span as the current active span like:

const ctx = api.trace.setSpan(api.context.active(), span);
tracer.startSpan('my-span-name', {}, ctx);

And you can set the current active span with:

const ctx = api.trace.setSpan(api.context.active(), span);
api.context.with(ctx, fn);

as documented in the example: https://github.com/open-telemetry/opentelemetry-js/blob/main/doc/tracing.md#describing-a-span

mjhenkes commented 1 year ago

Hello, I too am looking for a way to manually set an active context without using a wrapping with or setActiveSpan call.

My use case is that i'm propagating context from a server to the browser and I want all spans created in the browser to have the parent context set, however there isn't a clear method in my code to wrap in order to set the active context.

Thus far I've created a wrapping class that manually injects the context in each start span i call, but this solution won't work for auto instrumentation, since they won't know to call my function.

I've looked at creating a custom context provider, that allows the context to be manually set, but I don't want to go down that path unless I don't have any other options.

It feels like I'm missing something obvious. Since you can manually add parent spans to contexts it seems logical that you could also manually manage the active context. 🤷‍♂️

tareksalem commented 1 year ago

You can start a child span without setting the parent span as the current active span like:

const ctx = api.trace.setSpan(api.context.active(), span);
tracer.startSpan('my-span-name', {}, ctx);

And you can set the current active span with:

const ctx = api.trace.setSpan(api.context.active(), span);
api.context.with(ctx, fn);

as documented in the example: https://github.com/open-telemetry/opentelemetry-js/blob/main/doc/tracing.md#describing-a-span

the problem here, I have to pass a callback function, and callbacks are old fashion, not readable, and increase the complexity of code, is here another way to do this without callback?

Flarna commented 1 year ago

One may name it callback someone else names it function :o). I doubt that functions are old style and increase complexity in general. Depending on usecase a simple arrow function or some free function may fit better. Just put the part where context needs to be active into a function.

The reason for using a function here is to limit the scope where the context is active. If there would be a enter()/exit() function or similar the burden to ensure that exit() is called in all code pathes (exceptions,...) even in presence of nested enter() calls would be on the user. Wrong sequence would be undefined behavior, missing exit would be a leak.

So at least for exception handling a try/finally block would be needed so the extra braces + intention is needed anyway. But in current API design it's done once for all in Otel not clutter across the world of source code.

And there would be other pitfalls like use of await like this:

api.context.enter(ctx);
doSomeSyncStuff(); // ctx is active here, good
await doSomeAsyncStuff() // ctx is active until await returns, not good
api.context.exit(ctx) 

correct would be

api.context.enter(ctx)
doSomeSyncStuff(); // ctx is active here, good
const p = doSomeAsyncStuff() // ctx is active until await returns, not good
api.context.exit(ctx) 
await p;

Otherwise all microtasks - even that ones not related to doSomeAsyncStuff() - would likely see ctx as active.

No issue with context.with().

tareksalem commented 1 year ago

In my opinion, spans are categorized under "implementation details" and near-to-infrastructure implementation rather than business implementation so the developer should not consider opening spans and closing them instead of focusing on implementing the business logic, that's why opentelemetry has auto instrument functionality, js functions is everywhere and no one said functions are old fashion, but callbacks is a pattern to use functions and functions can be used using other patterns, so I think using these two concepts interchangeably doesn't convince me

regarding tracking which active context and worrying if the developer didn't close it, opentelemetry could have a track of the created spans across the request, and when the request ends close unclosed spans, this could be added to the auto instrument functionality to cover these edge cases

Flarna commented 1 year ago

How should OTel know which span belongs to a request? What is the exact definition of a request in general (not just HTTP/GRPC/...)?

You define in your code when some "operation" starts/end. And it's up to you to decide what is worth to monitor by a span. Also the nesting of spans can't be known by OTel based on nothing, therefore you have to set the span as active if you want something nested.

If you don't create spans in your code because you use auto instrumentations and they create enough spans for your needs then there should be no reason to set the active context. e.g the HTTP instrumentation is not only starting/ending a span, it calls also context.with() before your handler is called. A request to mongodb done somewhere in your HTTP handler code should result in a nested span out of the box.

If there is a concrete problem/context loss with an instrumentation this should be corrected with the instrumentation and not by all users.

tareksalem commented 1 year ago

How should OTel know which span belongs to a request? What is the exact definition of a request in general (not just HTTP/GRPC/...)?

I think opentelemetry uses async local storage from nodejs to handle its context, right? if yes then this could be doable by using nodejs async local storage also https://nodejs.org/api/async_context.html

If you don't create spans in your code because you use auto instrumentations and they create enough spans for your needs then there should be no reason to set the active context. e.g the HTTP instrumentation is not only starting/ending a span, it calls also context.with() before your handler is called. A request to mongodb done somewhere in your HTTP handler code should result in a nested span out of the box.

I see, but imagine this case, sometimes I want to create nested spans manually but at the same time I don't want use callback pattern to create it

If there is a concrete problem/context loss with an instrumentation this should be corrected with the instrumentation and not by all users.

Flarna commented 1 year ago

AsyncLocalStore is on no help here as there is no 1:1 mapping of a span to an AsyncResource.

Also ending all spans at the end of a request would be quite bad because it's perfectly fine to have a span running longer then the actual HTTP request, e.g. some cleanup, caching,... triggered within the request but not actually needed to complete it so it's allowed to run longer. So OTel SDK should not forcibly close such spans.

I see, but imagine this case, sometimes I want to create nested spans manually but at the same time I don't want use callback pattern to create it

Could you propose an API which fulfills your needs and is still safe regarding all the points I mentioned above (nested use, exception safe)? Adding a footgun API just to avoid old style callbacks doesn't sound promising.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

ivan-nejezchleb commented 1 year ago

Callbacks are useless in complex front end applications: Component with onClick callback start span -> action creator dispatching action --> redux-saga handling action and starting fetching data ---> multiple auto instrumented API requests to back end --> saga putting action to propagate fetch result into redux appState -> component rendering data end span

@Flarna How do you want to put all this into single function required as param for context.with if I want to encapsulate all this to single Span or even have parent span for whole use case and child spans for individual data loading?

Flarna commented 1 year ago

There were quite complex node.js applications before await was part of the language.

Anyhow, in node.js quite some effort was made to utilize the promise hooks exposed by v8 to have context tracking across await so code like this works fine there:

async function monitorMe() {
  await tracer.startActiveSpan("MyName", async (span) => {
    try {
      const fooResult = await foo(); // this or some inner function my create child spans
      span.setAttribute("fooResult", fooResult);
      const barResult = await bar(); // this or some inner function my create child spans
      span.setAttribute("barResult", barResult);
      // ...
    } catch (e) {
      span.recordException(e);
    } finally {
      span.end();
    }
  });
}

For browsers it's not that simple because they don't expose any functionality for async context tracking as of now. I'm not a browser expert but as far as I know Promises are instrumented by zone.js and therefore transpile to an older language level could be a workaround.

The TC39 proposal should result in a language level solution for this problem.

tareksalem commented 1 year ago

@Flarna you provided nice syntax in this code snapshot but for me, it's still a work around, what I meant by this ticket is thinking about a smarter rather than callbacks, I will work on something and post here as a proposal

ivan-nejezchleb commented 1 year ago

@Flarna My issue is not await/async syntax. In complex React apps, from moment of click to Button, through fetching some data to show them in UI, this all can be considered as one trace, requiring shared parent and everything happening under this parent. But all of this consist from multiple files, entities as actionCreator, sagaWatcher(s) and redux reducer. According to your approach all this needs to be pushed into single function, but it is not possible.

Flarna commented 1 year ago

I would assume that a button click is signaled by some onClick callback somewhere. And this would the function you are looking for. This function might trigger a lot async stuff spread across a lot functions. And these async task may even trigger more.

Maybe this onClick function is what you are looking for?

I'm not a react users so I might be wrong. But someone needs to tell OTel

  1. when a span starts
  2. which async operations belong to this span and which one do not (currently scoped by the callback passed to with)
  3. when a span ends

Point (2) can be done either by relying on ContextManager or by explicit passing a context around manually.

gavrix commented 1 year ago

There are use cases where wrapper-based APIs simply cannot be used no matter what. React component life cycle is a good example.

For context: react component is described by a render-function, which is called many times by react. Consider this component which relies on certain resource to be loaded asynchronously during render. We want to:

function Component() {
  // mark span started when the component renders first time
  const resource = useResource()
  // mark span end when resource !== null && resource.status === 'ready'
  return <JSX resource={resource} />
}

function useResource() {
 // start span nested in active context (context of rendering component) 
  fetchResource(() => {
    // end span when resource is loaded
  })
}

The way react works is it will call Component as a function multiple times, every time when resource acquisition status changes. There is simply no way to express this process as a single function for context API to wrap. The way to achieve this seems to pass the parent span around into every function which might create a child span, which is obviously a terrible approach.

Is there any other way which I am missing?

Flarna commented 1 year ago

Could you please mark the places in above code where a specific span should be set as active and where should it be set as inactive?

gavrix commented 1 year ago

I'd have to flesh out some implementation details then. Here's how I envision this to work.

function Component() {
 /* 
  `componentRenderSpan`  is created only once, hence useMemo
  Upon creation `componentRenderSpan` becomes active
 */
  const componentRenderSpan = useMemo(() => api.startActiveSpan('Component render'))
  const resource = useResource()
  if (resource !== null && resource.status === 'ready') {
    componentRenderSpan.endSpan() // this deactivates `componentRenderSpan`
  }
  return <JSX resource={resource} />
}

function useResource() {
  const [resourceState, setResourceState] = useState({status: 'loading'})
  useEffect(() => { 
    const resourceSpan = api.startSpan('Resource') // resourceSpan automatically becomes nested in componentRenderSpan which is currently active
    fetchResource(() => {
      resourceSpan.endSpan() // 
      setResourceState({status: 'ready'})
    })    
  })
  return resourceState
}
Flarna commented 1 year ago

Is this all sync? activating a context/span requires a deactivate on the sync code path before leaving area/tick where the corresponding work is done otherwise this potentially leaks into other, unrelated async work. The context manager is responsible to forward this context into any async operation created when a context is active.

Above code looks a bit like a context is activate for the complete lifetime of a component span. Is it guaranteed that only one component at a time is active? if not, which span should be the active one?

To me this sounds a bit like the react framework (or some react instrumentation) should take care of context activation/deactivation instead for components and potentially even create spans for this.

ivan-nejezchleb commented 1 year ago

Imho potentially multiple spans can be active in JS app - multiple components may load corresponding data from server via resource fetching. It certainly needs some form of context propagation and spans hierarchy management. May be the react context could be used for that

gavrix commented 1 year ago

To me this sounds a bit like the react framework (or some react instrumentation) should take care of context activation/deactivation instead for components and potentially even create spans for this.

Is there an API right now which would allow to manually activating/deactivating contexts? That was the original question I think, how to make context switching without providing a callback function.

Flarna commented 1 year ago

There is no such API. Above samples show why: It allows to activate a context but do not deactivate it on the sync code path which results in very hard to track context leaks.

gavrix commented 1 year ago

So we back to square one. Does this mean oTel is by design limited and by design can not be used to instrument react render process in client applications?

ivan-nejezchleb commented 1 year ago

@gavrix I dont think so. If your use case is just life cycle of single component, then you can keep active span in context of this component (as ending it inside callback is your responsibility). So you can intentionally NOT end it in callback and instead keep it in context of component. Then you can use it in multiple hooks and component functions as active span (using OTL API .with again without ending the span in callback) and then end it manually later during component destruction.

Flarna commented 1 year ago

There is a difference between the point in time where startSpan/endSpan is called and when a context is activated/deactivated.

spans are started when some "operations" begins and ended when it ends. Don't care if it is sync or async. spans on itself are never set as active - it's a context which is set as active.

A context may or may not hold a span. It may hold also other items like baggage,... a context is more or less an immutable map.

The parent/child relationship between spans is done via context. tracer.startSpan() doesn't care about where a given context comes from. It can be passed as argument or it can be the current active context.

Setting a context active is global. There is always exactly one context active (might be the ROOT_CONTEXT which is empty). As activate/deactivate is global it's quite clear that it needs to be scoped.

For example if you want to signal that some piece of code runs "in context of some span" it's needed to set it active exactly for this synchronous code flow to avoid that other, unrelated ticks running afterward think they belong to this span.

The with() API ensures this scoping. I agree that technically an enter/exit would also work but the burden to correctly match them doesn't go away.

The ContextManager component is responsible to propagate a context across as async boundaries. e.g. if you call process.nextTick(someCb) at a time contextWithSpanA is active the context manager takes care that this context is set as active before someCb is called and inactive after it returns.

As said above I'm not an react expert and I don't know how react decides internally when to call what and which parts are sync/async. But I'm quite sure there are places within react framework where they know that they start executing code for some component and when this is done. This is the place where context activation/deactivation should happen.

There is the AsyncResource class in node.js which could be used for this. But again this class has only a runInAsyncScope(cb) API like we have it here - for exactly the same reason.

ivan-nejezchleb commented 1 year ago

@Flarna Mentally I dont have problem with with() API to call something in context of active span. I have "problem" with requirement to end the span as last step in callback. This prevents to spread one logical action, belonging to single span, cross multiple places in code base

Flarna commented 1 year ago

There is no need to end a span as last step of .with().

tracer.startActiveSpan("root", (rootSpan) => {
    const s1 = tracer.startSpan("s1");
    const ctx1 = api.trace.setSpan(api.ROOT_CONTEXT, s1);

    const s2 = tracer.startActiveSpan("s2", (s) => {
        tracer.startSpan("s2.1").end();
        return s;
    });
    const ctx2 = api.trace.setSpan(api.ROOT_CONTEXT, s2);

    api.context.with(ctx1, () => {
        tracer.startSpan("s1.1").end();
    });

    api.context.with(ctx2, () => {
        tracer.startSpan("s2.2").end();
    });

    api.context.with(ctx1, () => {
        tracer.startSpan("s1.2").end();
    });

    s1.end();
    s2.end();
    rootSpan.end();
});

creates following hierarchy:

root
    s1
        s1.1
        s1.2
    s2
       s2.1
       s2.2

Clearly a real world application will not have all this in one function and sync. But this is the area where ContextManager and AsyncResource comes handy to take care to propagate context instances across async boundaries.

mmeyer1 commented 1 year ago

This is a good discussion and exactly what I'm struggling with right now - enterprise wants an enterprise wide react library to "do OTEL" but they don't want the teams owning the front ends interacting with the OTEL API - the last comment was very helpful. It's really tricky to abstract a bunch of React functional components :) I've been experimenting with using a higher order functional component to start a span (ending it is a bit tricky in this scenario though).

reith commented 1 year ago

The ContextManager component is responsible to propagate a context across as async boundaries. e.g. if you call process.nextTick(someCb) at a time contextWithSpanA is active the context manager takes care that this context is set as active before someCb is called and inactive after it returns.

@Flarna And why it's the correct implementation? The document says:

A Context is a propagation mechanism which carries execution-scoped values across API boundaries and between logically associated [execution units](https://opentelemetry.io/docs/specs/otel/glossary/#execution-unit). Cross-cutting concerns access their data in-process using the same shared Context object.

I produced a sample case where executionAsyncId and triggerAsyncId are different, and I'm using default AsyncLocalStore context manager, but the same context is getting used. Is this expected? My understanding is I should get a different context activated wherever triggerAsyncId is different.

Update: I created a new issue to keep this one in context.

HunterLarco commented 11 months ago

+1 to the API ask here. As currently written it's true that with protects the user from creating malformed context state, however, the lack of enter/exit context API methods prevents more complicated use cases where the client is responsible for managing context lifecycle.

For example, I have the following code

export const doInSpan = <R>(metadata: {
  tracer: oTelApi.Tracer,
  name: string,
}, callback: () => R): R => {
  const { tracer, name } = metadata;

  return tracer.startActiveSpan(name, (span) => {
    try {
      const response = callback();
      span.setStatus({ code: oTelApi.SpanStatusCode.OK }); 
      return response;
    } catch (error) {
      span.recordException(error);
      span.setStatus({ code: oTelApi.SpanStatusCode.ERROR }); 
      span.setAttribute(
        oTelSemanticConventions.SemanticAttributes.EXCEPTION_ESCAPED,
        true,
      );  
      throw error;
    } finally {
      span.end();
    }   
  }); 
}

This code works great for sychronous code but is poorly implemented for asychronous code. If one used this helper like so

doInSpan({
  tracer: ...,
  name: 'example',
}, async () => {
  await asyncTimeout(1000 /* milliseconds */);
});

Then the span would end when the Promise is created for our async lambda NOT when the promise resolves/rejects. We'd see a span that lasted microseconds not the full second you'd expect.

To fix this we have 2 options:

  1. Create a second doInSpanAsync method which awaits the callback.
  2. Manually manage the context lifecycle via enter/exit methods where we check if the callback return is instanceof Promise (example below)

The second is a lot stronger of an API because it allows doInSpan to correctly manage either case, async or sync code without the developer needing to be aware of using separate API's for each. Implementing this solution currently is not possible, however, with enter/exit methods I would expect it to look something like:

export const doInSpan = <R>(metadata: {
  tracer: oTelApi.Tracer,
  name: string,
}, callback: () => R): R => {
  const { tracer, name } = metadata;

  const parentContext = api.context.active();
  const span = tracer.startSpan(name, {}, parentContext);
  const contextWithSpanSet = api.trace.setSpan(parentContext, span);

  const exitContext = api.context.enter(contextWithSpanSet, span);

  const recordError = (error: Error) => {
    span.recordException(error);
    span.setStatus({ code: oTelApi.SpanStatusCode.ERROR }); 
    span.setAttribute(
      oTelSemanticConventions.SemanticAttributes.EXCEPTION_ESCAPED,
      true,
    );  
  }

  const exitSpan = () => {
    span.end();
    exitContext();
  }

  const executeCallback = (): R => {
    try {
      return callback();
    } catch (error) {
      recordError(error);
      exitSpan();
      throw error;
    }   
  }

  const response = executeCallback();
  if (response instanceof Promise) {
    response.then(() => {
      span.setStatus({ code: oTelApi.SpanStatusCode.OK });
    });
    response.catch(recordError);
    response.finally(exitSpan);
  } else {
    span.setStatus({ code: oTelApi.SpanStatusCode.OK });
    exitSpan();
  }
  return response;
}
Flarna commented 11 months ago

If I understand you proposed correct api.context.enter(contextWithSpanSet, span) picks up the current active context and the returned function sets this as active once called.

Are you sure that restoring this context is always the correct/expected one after the async operation has ended?

There are two cases where you call exitContext():

Also the active context directly after calling your proposed doInSpan() function would the that one with the newly created span. This is wrong and shows exactly why with() is the better choice here as it does the right scoping of activating the scope for the callback.

Consider following code:

tracer.startActiveSpan('span-a', async (spanA) => {
  // active context shows span-a => ok
  const pb = doInSpan({ tracer, "span-b" }, () => {
    // active context shows span-b here => ok
   return timersPromises.setTimeout(1000);
  });
  // active context shows span-b => not ok
  const pc = doInSpan({ tracer, "span-c" }, () => {
    // active context shows span-c here => ok
   return timersPromises.setTimeout(100);
  });
  // active context shows span-c => not ok
  // or do we even have a timing dependent context here?
  await Promise.all([pb, pc]);
  // what is the active context here?
  // or do we even have a timing dependent context here?
});
Ugzuzg commented 11 months ago

@HunterLarco, your original doInSpan doesn't await the promise. It is expected that the span ends right away. The result you want can be achieved using the following code, I think:

export const doInSpan = <R>(metadata: {
  tracer: oTelApi.Tracer,
  name: string,
}, callback: () => R): R => {
  const { tracer, name } = metadata;

  return tracer.startActiveSpan(name, (span) => {
    const handleSuccess = (response) => {
      span.setStatus({ code: oTelApi.SpanStatusCode.OK });
      span.end();
      return response;
    };

    const catchError = (error) => {
      span.recordException(error);
      span.setStatus({ code: oTelApi.SpanStatusCode.ERROR }); 
      span.setAttribute(
        oTelSemanticConventions.SemanticAttributes.EXCEPTION_ESCAPED,
        true,
      );
      span.end()
      throw error;
    };

    try {
      const response = callback();
      if (response instanceof Promise) {
        return response.then(handleSuccess, catchError);
      }
      return handleSuccess(response);
    } catch (error) {
      catchError(error);
    }
  }); 
}
pkuphy commented 11 months ago

I want to share my use case and solution.

I use XMLHttpRequestInstrumentation to automatically trace XHR requests. At the same time, I manually trace user click events in the code. I want to associate the requests sent by the page after the user clicks with the click event. I searched for a long time but couldn't find a very elegant solution.

Later, I thought of a potential hack:

First, make the span created during the click globally visible:

const onClick = () => {
  const span = tracerForBehavior.startSpan(...)
  window.behaviorScope = span;
  ...
}

Config XMLHttpRequestInstrumentation, set function applyCustomAttributesOnSpan, which will be executed when the request ends.

const applyCustomAttributesOnSpan = (span, xhr) => {
  const behaviorScopeInfo = window.behaviorScope?.spanContext();
  if (behaviorScopeInfo) {
    const { traceId } = behaviorScopeInfo;
    span.setAttributes({
      'behavior.trace_id': traceId,
    });
  }
};

registerInstrumentations({
  instrumentations: [
    new XMLHttpRequestInstrumentation({
      applyCustomAttributesOnSpan: applyCustomAttributesOnSpan,
    }),
  ]
});

Later, when processing the logs, use the behavior.trace_id from the attributes as the trace_id.

The result fits my demand well.

Screenshot-2023-10-13_10_46_21

I hope this hack provides some inspiration for you.

lukiano commented 11 months ago

Just stumbled upon this issue while looking for something else, and I was wondering if the using keyword that typescript 5.2 is introducing based on https://github.com/tc39/proposal-explicit-resource-management would help avoiding wrapping functions.

Flarna commented 11 months ago

Yes, I think using is likely a good fit here. using is scope bound and it doesn't allow e.g. to set a context nested but reset in a different order or at a complete unrelated place.

airhorns commented 11 months ago

+1 for adding a lower-level, "I know what I am doing" API. I've come across two use cases in super commonly used frameworks in node: fastify and jest. Both have a way to wrap units of execution, but they only have callbacks for a before phase and then a different callback for an after phase. There's no way in either system to wrap the whole unit of execution.

In fastify, it's request hooks:

// the real API, exists
fastify.addHook("onRequest", async () => {
  // runs before request processing starts for every request
  // request processing only resumes after this async function resolves
});

fastify.addHook("onResponse", async () => {
  // runs after request processing has finished for most requests
});

fastify.addHook("onTimeout", async () => {
  // runs after request processing has finished for requests that timeout
});

There isn't an api that works with context.with, like this:

// doesn't exist
fastify.addWrapperHook(async (run) => {
   await context.with(..., run)
});

Same with jest, there are hooks for doing things at the start of a test and at the end:

beforeEach(() => {
  // runs before every test
});

afterEach(() => {
  // runs after every test
});

There is no opportunity to wrap the whole test execution, including other beforeEach/beforeAll hooks, this api doesn't exist:

// doesn't exist
aroundEach((run) => {
  context.with(..., run);
});

I get that the .with API is vastly preferable as it allows the authors of this package to bake in all the smarts they can around cleanup and context propagation and whatnot. But, the decision to completely disallow other authors the same responsibilities is restrictive for those brave enough to venture in. I for one value pragmatism -- make it clear how I am supposed to do something, but don't make it impossible to do it a different way if I have a week to burn on making it work well. Documenting that an API is for low-level, dangerous use only seems like a fine way to appease the numerous people in this thread with real use cases while not encouraging all users to deploy an antipattern.

airhorns commented 11 months ago

Also, the nodejs primitive atop which context management is done by default supports the imperative style API itself: https://nodejs.org/api/async_context.html#asynclocalstorageenterwithstore

Flarna commented 10 months ago

The APis in node are experimental. So if they are added in OTel API I recommend to add them also experimental there to have the opportunity to change/remove them at any time.

It's unlikely that enter/exit leave experimental status in node.js because the TC39 proposal for AsyncContext doesn't have these APIs either and target is to put ALS on top of AsyncContext at some time.

Also JS other runtime candidates like e.g. cloudflare workers have no enter/exit even they took node.js as prototype (because of the TC39 proposal).

Maybe frameworks like fastify/jest adapt over time to be better compatible with this TC39 proposal.

airhorns commented 10 months ago

I think an experimental API would be great! I think the folks in this thread are looking for a way forward, not a super-duper robust, reliable thing, especially if the standards bodies that be haven't decided what the best way is. That proposal is still stage 2 though so it may change as well -- another good reason to mark this as experimental.

That said, folks are using jest, fastify etc today, so I think it'd be pragmatic to add an API today, especially if there is precedent for making breaking changes to experimental things in non-major versions. I'd be fine with an api that threw if the underlying platform didn't support the feature. jest and fastify aren't really commonly used in cloudflare worker code as far as I know.

Abion47 commented 8 months ago

Throwing my two cents into this issue with my own use case.

I'm creating a centralized logging and tracing library for a backend overhaul using Winston as a base. The intention was for the base library be target-agnostic and to implement destination targets (such as OTel) as sister transport libraries. One feature of the library was the ability to spin off child instances/contexts from the global instance/context, which would be implemented in the various transports as transactions or spans where appropriate:

(very simplified overview)

class Logger {
  private transports: TransportWrapper[];

  ...

  createChildContext(...) {
    const childTransports = [];
    for (const transport of this.transports) {
      childTransports.push(transport.createChild(...);
    }
    return Logger.child(childTransports);
  }

  close() {
    for (const transport of this.transports) {
      transport.close();
    }
  }
}

As far as I can tell, the requirement of OTel to implement active context and spans using callbacks renders this approach virtually impossible, at least without some ill-advised voodoo. Any callback function I pass to OpenTelemetryTransportWrapper.createChild wouldn't have to end before any child transport returned from the function would be usable, defeating the process.

For reference, this is what I would expect to be able to do:

class OpenTelemetryTransportWrapper extends TransportWrapper {
  createChild(...) {
    ...

    const spanContext = {
      // Manually create the new span context
    };

    const span = tracer.startSpan(name, attributes, spanContext);

    return new OpenTelemetryTransportWrapper(..., span);
  }

  close() {
    this.span.end();
  }

  ...

  // All of the relevant logging functions that extract metadata and adds it to the span represented 
  // by the current transport instance
}
const logger = new Logger([
  new OpenTelemetryTransportWrapper(...),
  // other transports added here
]);

...

async function someInnerTask() {
  // Automatically creates spans or transactions in OTel, Sentry, etc. depending on transports included
  const localLogger = logger.createChild(...); 

  // Logging and context data is routed to the various transports, including providing ongoing metadata 
  // to the various spans/transactions
  localLogger.info('Some relevant data', { contextData: 'foo' });

  // Do some work and get some data
  const result = await task(...);

  // Closes/ends all relevant spans/transactions
  localLogger.close(); 
}

(For the record, that's not how I would actually use the createChild functionality. This is just an over-simplified example.)

If I was using the default instrumentation libraries that OTel provides, it makes sense that the approach for managing contexts and spans is heavily opinionated. But if I'm trying to use the base API to create my own solution, IMO it's not the library's place to tell me how I am and am not allowed to do it. If I want to be able to manually manage contexts and spans, I should be able to. And if that capability comes with the real risk of shooting myself in the foot, that's the danger that I accepted when choosing to go the manual route in the first place.

Yes, callbacks are still around, and yes they have a variety of legitimate use cases and advantages. But they also have a variety of disadvantages, and callback hell is just as much of a thing now as it was 20+ years ago. This issue might not be directly related to the whole async/await scenario, but the reasons that people have migrated en masse away from using the callback promise APIs are also relevant here. People can have good reasons to want to use them, but they can also have good reasons to not want to use them, and a base API library shouldn't be in the business of taking sides on these kinds of implementation details.

rilpires commented 7 months ago

@HunterLarco, your original doInSpan doesn't await the promise. It is expected that the span ends right away. The result you want can be achieved using the following code, I think:

I swear my code is exactly as yours. And it works, just for one scope. When I try to reuse this logic in other parts, somehow, they end being a child of an older span and I have no idea why. Here is my code:

async function initializationCode() {
  for (promise of initializionPromises) {
    await wrappedOnstartActiveSpan(promise);
  }
}

startActiveSpan('initialization',
  async (span) => {
    await initializationCode();
    span.end();
  }
);

// server is running fine and spans were properly
// grouped inside the initialization span

// user requests by API. This is my middleware handling it:
startActiveSpan('user_request',
  async (span) => {
    await doStuff()
    span.end();
  }
);
// Somehow, the user_request is being grouped sometimes inside the
// initialization span, or inside one of the initialization spans

Is there something I'm missing? Should I explicitly "clear" a context somehow? my "wrappedOnStartActiveSpan" is almost the same as @HunterLarco , but I'm not passing the tracer around but instantiating a new one with the api.trace.getTracer(...)

alaingiller commented 5 months ago

I've the same problem, I'm using Angular and thus cannot use callback. My workaround / hack is to implement my own ContextManager, and "misuse" then bind() method to set a global active context. In my case I implemented a "first win" strategy, but this can be adapted quite easily.

export class RootContextManager extends StackContextManager {
    /**
     * If the current span is terminated (span.end() was called), reset the context to ROOT_CONTEXT
     */
    override active() {
        const span = api.trace.getSpan(this._currentContext);
        // If the current span is terminated (span.end() was called), reset the context to ROOT_CONTEXT
        if (span?.isRecording() === false) {
            this._currentContext = api.ROOT_CONTEXT;
        }
        return super.active();
    }

    override bind<T>(context: api.Context, target: T): T {
        const span = api.trace.getActiveSpan(); //getSpan(this._currentContext);
        // only bind the context if there is no recording active span. First win, it can be only have one active span.
        if (!span || !span.isRecording()) {
            this._currentContext = context;
        } else {
            const activeSpanName = (span as any).name;
            const newSpanName = (api.trace.getSpan(context) as any)?.name;
            api.diag.info(
                `There is already an open active span: '${activeSpanName}' -> '${newSpanName}' will not be used as parent span`
            );
        }
        return super.bind(context, target);
    }
}
const tracer = api.trace.getTracer('MyTracer');
const span = tracer.startSpan(operationName);

// set span as global current span (if there is currently no current span)
const ctx = api.trace.setSpan(api.context.active(), span);
api.context.bind(ctx, null);
KaydenMiller commented 2 months ago

I have been trying to do some research into this, found this thread. Since the existing span implementation doesn't nativly support disposable via the symbols that were mentioned earlier by @lukiano I was looking into trying to create a wrapper class that just acts as a pass through for the regular otel span.

export class DisposableSpan implements Disposable {
  private currentSpan: Span

  constructor(
    private name: string,
    private serviceName: string,
    private options?: SpanOptions
  ) {
    this.currentSpan = trace
      .getTracer(serviceName)
      .startSpan(this.name, this.options ?? {}, context.active())
    const ctx = trace.setSpan(context.active(), this.currentSpan)
    context.bind(ctx, null)
  }

  /*
       ... SPAN WRAPPER FUNCTIONS HERE
  */

  [Symbol.dispose](): void {
    this.currentSpan.end()
  }
}

however, this doesn't mark the span created by the object as active during the scope of the using block. I'm assuming this is due to the fact that bind isn't intended to be used this way and since there isn't a function target to be bound I can't do it this way. From what I understand from this thread there currently is not an exposed way to start a span as active without also providing the function target. Is that correct?

Ideally i would like to be able to write code like:

function fnNameHere() {
  using span = new DisposableSpan(); // or some other way of creating the span, like a factory
  // ... do stuff here
  // span is cleaned up at end of block scope
}