getsentry / sentry-javascript

Official Sentry SDKs for JavaScript
https://sentry.io
MIT License
7.97k stars 1.57k forks source link

Request isolation is broken when using Sentry v8 with existing OTel setup #13055

Closed nwalters512 closed 2 weeks ago

nwalters512 commented 3 months ago

Is there an existing issue for this?

How do you use Sentry?

Sentry Saas (sentry.io)

Which SDK are you using?

@sentry/node

SDK Version

8.20.0

Framework Version

N/A

Link to Sentry event

N/A

SDK Setup/Reproduction Example

https://github.com/nwalters512/sentry-memory-leak-repro

This example uses a minimal Sentry + OTel setup, with code/structure being copied directly from the Sentry + OTel documentation.

Steps to Reproduce

  1. Clone the reproduction repository
  2. Run npm install
  3. Run npm run start
  4. Run curl http://localhost:3000/ a few times

Expected Result

I would expect the log from each request to show just a single event processor. This would indicate that event processors aren't leaking between requests and thus that request isolation is working correctly.

Actual Result

Every time a request comes in, another event processor is added to the existing scope, indicating that the same scope is being shared among all events. In practice, this will lead to a memory leak, since the array of event processors will grow without bounds.

mydea commented 3 months ago

Hey,

so your setup has a couple of issues.

You write:

This example uses a minimal Sentry + OTel setup, with code/structure being copied directly from the Sentry + OTel documentation.

What docs did you follow to set this up? I am just asking because the docs I am aware of about a custom otel setup: https://docs.sentry.io/platforms/javascript/guides/node/tracing/instrumentation/opentelemetry/#using-a-custom-opentelemetry-setup explain a quite different setup:

  1. You need to configure skipOpenTelemetrySetup: true in your Sentry.init({})
  2. You are missing the sentry sampler, span processor, propagator, and context manager
  3. It also seems to miss a dsn? (this should not break stuff, but just wondering)

When your setup is correct, can you try again?

nwalters512 commented 3 months ago

What docs did you follow to set this up?

For OTel, I followed the official "getting started" documentation: https://opentelemetry.io/docs/languages/js/getting-started/nodejs/

For Sentry, I followed the "ESM (MJS)" docs under "installation methods": https://docs.sentry.io/platforms/javascript/guides/node/install/esm/.

I am just asking because the docs I am aware of about a custom otel setup:

Hmm, honestly, I didn't know those docs existed, but even then, I wouldn't say this is a "custom" OTel setup. This is a bog-standard setup from the official "getting started" documentation. I see the actual page says "If you already have OpenTelemetry set up yourself, you can also use your existing setup", which I guess applies here. But if that's the case, I would expect this to be part of the top-level setup instructions, not buried under "Set Up Tracing" (I don't use tracing, so this doesn't apply to me). Even the docs for @sentry/opentelemetry state "This package allows you to send your OpenTelemetry trace data to Sentry via OpenTelemetry SpanProcessors" - it makes no mention of this being necessary for the OG basic Sentry features to work correctly.

All that said: adding just contextManager: new Sentry.SentryContextManager() to the NodeSDK constructor options seems to have fixed the context isolation issue.

From looking at the wrapContextManagerClass function, I can understand on a technical level why a custom context manager seems to be required: you're trying to provider interop between Sentry's bespoke "scope" API and the official OTel "contex" API, which is an admirable goal. If I can offer up my 2 cents: your current documentation isn't upfront enough about the fact that the Sentry SDK is now a fancy OTel wrapper and that you need a very specific OTel setup to function correctly. Not just for traces, as your documentation implies, but for all SDK functionality. It would also be very valuable to explain why these additional pieces are required, as opposed to just asserting that they are. For instance: it's not clear to me if/why I'd need a custom span processor, sampler, or propagator if I'm not using Sentry tracing.

nwalters512 commented 3 months ago

Additional feedback from trying to get things working:

So, it seems like only SentryContextManager is required if one wants to use Sentry for error reporting but not tracing. This contradicts what validateOpenTelemetrySetup() claims:

https://github.com/getsentry/sentry-javascript/blob/a880712fbb09c6e46888cf90f47c7054d32a933c/packages/node/src/sdk/index.ts#L198-L205

mydea commented 3 months ago

Hey,

yeah we are aware that our docs on this are not ideal and a bit buried as of now - we are currently reworking these docs here: https://github.com/getsentry/sentry-docs/pull/10872 where all of this should become more prominent and clear, hopefully.

Generally, also the other parts (sampler, propagator) are required for everything to work properly, even if you do not use tracing with Sentry. This is because we depend on this for trace propagation etc. to work as expected.

The span processor is actually not needed when you do not use sentry for tracing - I will adjust the validation logic to reflect this: https://github.com/getsentry/sentry-javascript/pull/13079

nwalters512 commented 3 months ago

It's great to hear that improved docs are on the way!

Generally, also the other parts (sampler, propagator) are required for everything to work properly, even if you do not use tracing with Sentry. This is because we depend on this for trace propagation etc. to work as expected.

This sentence doesn't make a whole lot of sense to me. Why do I care about Sentry's custom trace propagation if I'm not using Sentry tracing?

chargome commented 3 months ago

Hey @nwalters512,

This sentence doesn't make a whole lot of sense to me. Why do I care about Sentry's custom trace propagation if I'm not using Sentry tracing?

Trace propagation is still needed for automatically connecting different services, e.g. to find related errors in frontend + backend. So this applies even if you're not using Sentry for monitoring performance.

nwalters512 commented 3 months ago

Interesting! We don't use Sentry in a way that spans multiple services or frontend/backend, but that's good to know. This is exactly the kind of thing I'd hope to see in the updated documentation.

chargome commented 3 months ago

Feel free to read over the updates and leave feedback: https://github.com/getsentry/sentry-docs/pull/10872

getsantry[bot] commented 3 weeks ago

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you remove the label Waiting for: Community, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀