Closed jeengbe closed 4 months ago
Hey, thank you for the feedback!
You are right that this was done to avoid leaking memory. Your use case is interesting, we should def. support this! We'll introduce a config option for this. I think I'd go with a maxSpanDuration
config option which defaults to 5m, which you could overwrite to a value of your choosing, so you could set this to a really high value like 24h if you want. You should never fully disable this because you can always get orphaned spans (for whatever reason, e.g. some parent span is dropped somewhere) which would never be sent/cleaned up, but a high value could be selected nevertheless!
For the reason why also shorter children are dropped: This is because every children waits for it's own parent before it can be sent. So if the parent is not there after 5min, the child is dropped (even if it is completed). This is not ideal and hopefully at some point we can improve this, but this will require us to be able to send spans on their own (not just in a transaction event), which we are working on but cannot do today!
Your use case is interesting
Out of interest; how is this a rate/interesting case? Sentry even has a Cron monitoring product, so instrumenting a batch job surely can't be that unusual I'd think.
Sorry, what I meant to say is, your use case makes sense and we should support it properly! It's always a bit of a tradeoff with the cleanup duration, as it depends on your setup how many spans are emitted/orphaned/need to be cleaned up etc 🤔 so we'll need to see what an ideal default value is (maybe it is higher than 5min), but def. a good first step is to make it configurable at all!
Is there an existing issue for this?
How do you use Sentry?
Sentry Saas (sentry.io)
Which SDK are you using?
@sentry/node
SDK Version
8.9.2
Framework Version
No response
Link to Sentry event
https://prisjakt-ab.sentry.io/performance/trace/962f9a6f5f264f8986da7d9ace318b6f/
SDK Setup
No response
Steps to Reproduce
Due to how https://github.com/getsentry/sentry-javascript/blob/1201eb2d78425709293ee0aac180cbcc1f4e99d2/packages/opentelemetry/src/spanExporter.ts#L53-L58 only flushes spans once a root span has finished, any child spans for an operation that takes longer than 5 minutes (e.g. a cronjob) are dropped.
This is what the trace used to look like: https://prisjakt-ab.sentry.io/performance/trace/890f3ee3731e41119416f3964e90719f, this is now: https://prisjakt-ab.sentry.io/performance/trace/962f9a6f5f264f8986da7d9ace318b6f
I understand that this is to avoid leaking memory (makes sense, I guess), but it would be nice to perhaps configure this limit and disable it entirely. Clearing old spans to avoid memory leak is not really an issue for batch operations that only consist of a single root span and terminate once it's completed anyway.
Furthermore, because of how https://github.com/getsentry/sentry-javascript/blob/1201eb2d78425709293ee0aac180cbcc1f4e99d2/packages/opentelemetry/src/spanExporter.ts#L174-L177 works, any spans longer than 5 minutes are automatically discarded. Fortunately, the root span is still sent because cleanup is never attempted for a finished root span.
Expected Result
a
Actual Result
a