Closed derekrprice closed 1 year ago
Presumably, I would want to wrap calls in Sentry.withScope()
, but it doesn't look like that can handle an async callback, so that won't work here.
Every Azure Function invocation is passed a unique "invocation ID" which you could probably use to index a map of scope stacks but the current Sentry SDK implementation's single global scope stack isn't going to cut it here.
Not the most elegant code, but so far I haven't been able to come up with anything better within the limitations of JS. Something like this should work to replace the SDK's existing withScope. I do also need it to be async and return its caller's result because I need to be able to propagate any exceptions thrown inside the callback.
Basically, this maintains the scope map index as a dynamic function name - you would have to replace the SDK's current global stack with a global map, but that shouldn't be complicated. You wouldn't even need to use the function ID I suggested earlier - each call to withScope can simply supply its own retrievable unique key. My key generation is simplistic, but I think it should be sufficient for this use case. This even works when nested:
function getScopeKey() {
const e = new Error();
const key = e.stack.match(/(?<=as sentryScopeKeyIs:)(?:\d+)/);
return key?.[0];
}
let uniqueKey = 1;
async function withScope (callback) {
const id = uniqueKey++;
const functionName = "as sentryScopeKeyIs:" + id;
x[functionName] = function(callback) {
return callback();
}
return x[functionName](callback);
}
// One level.
withScope(getScopeKey).then(scopeKey => console.log(scopeKey));
// Nested scopes.
withScope(async () => {
return [
getScopeKey(),
await withScope(getScopeKey)
];
}).then(scopeKey => console.log(scopeKey));
This is tested in chrome, btw. Hopefully it works as well with node's chromium engine. Other browsers may need some sort of polyfill.
Bump. Thought about this some more. Obviously having withScope become async would be a breaking change, but that isn't the most important part of this fix and I think if I passed withScope an async function, withScope itself wouldn't need to be async as long as it returned the result of the callback since I can await the returned promise then.
Anyhow, the rest of this is a really important change. Any chance of it getting into a release soon?
Here is the revised code. Looks like I had left the dummy "x" object in my previous code global too, which is unneeded:
function getScopeKey() {
const e = new Error();
const key = e.stack.match(/(?<=as sentryScopeKeyIs:)(?:\d+)/);
return key?.[0];
}
let uniqueKey = 1;
function withScope (callback) {
const id = uniqueKey++;
const functionName = "as sentryScopeKeyIs:" + id;
const x = {
[functionName]: function(callback) {
return callback();
}
};
return x[functionName](callback);
}
async function doApplicationStuff() {
return getScopeKey();
}
// One level.
withScope(doApplicationStuff).then(scopeKey => console.log(scopeKey));
// Nested scopes.
withScope(async () => {
return [
getScopeKey(),
withScope(doApplicationStuff)
];
}).then(scopeKey => console.log(scopeKey));
Have you tried to play around with manually pushing/popping scopes? This is how we do it in the AWSLabda case - https://github.com/getsentry/sentry-javascript/blob/master/packages/serverless/src/awslambda.ts#L267-L287
If it won't cut it, then it'll have to wait until we rework how integrations work, as using getCurrentHub
doesn't play nicely with multiple client instances or with async code. This, however, requires a lot of internal refactoring, which will take some time until completed.
Pushing and popping scopes does not work with async code. What is wrong with my suggested replacements for withScope and getScope? I believe it would be a non-breaking change.
Any particular reason pushing and popping scopes doesn't work for you? I was in a similar situation and was tearing my hair off trying to find a solution to how to keep the Sentry context tied to a request in a web service environment. Passing Koa's context
object around is not an option in real applications, so I don't know why the usage of push/pop scopes is so poorly documented when it comes to Sentry tracing.
Anyhow, this solved it for me:
// Mostly code from https://docs.sentry.io/platforms/node/guides/koa/
const transaction = sentry.startTransaction({
name: `${reqMethod} ${reqUrl}`,
op: 'http.server'
});
const hub = sentry.getCurrentHub();
try {
// Push the scope
const scope = hub.pushScope();
scope.setSpan(transaction);
context.res.on('finish', () => {
transaction.setHttpStatus(context.status);
transaction.finish();
});
await next();
} finally {
// ... some other cleanup and finishing the transaction
hub.popScope();
}
@richardsimko: Because we have dozens, sometimes hundreds of threads running in parallel (yes, I understand that JS threads are virtual, but the end result is the same). So, every time a process awaits IO, another thread might pick up control. Actual flow ends up looking like:
thread 1: const scope = hub.pushScope();
thread 1: scope.setSpan(transaction);
thread 1: // ... do stuff that might await IO
thread 2: const scope = hub.pushScope();
thread 2: scope.setSpan(transaction);
thread 2: // ... do stuff that might await IO
thread 1: // do stuff that logs to thread 2's scope & transaction
thread 1: // ... do stuff that might await IO
thread 3: // ... and so on...
In practice, we have an Azure Functions app that receives dozens of queued requests in waves, and the waves sometimes come in quick succession. I believe that each listener is configured to receive up to 32 queued messages at a time and about a dozen listeners are running in parallel, listening to different queues. We end up seeing all the transactions from a wave attached to whichever was the last to call hub.pushScope() and register its transaction, and often transactions are even registered to the wrong app, since the individual application listeners are also running in parallel and pulling 32 messages each. I've seen HTTP calls and DB queries from what looks like over 100 discrete function invocations attached to a single function invocation, with the other all invocations registering as not having logged any HTTP or DB requests to Sentry.
Since I don't know the details of your implementation I can't say for sure, but to me this sounds very similar to how a Node web server works. It also handles concurrent (Of course concurrent-ish, since Node is single-threaded as you say) requests and while waiting for slow async tasks it processes other things. This is exactly the scenario we're experiencing and from my understanding the scenario push/pop scope is intended to solve when reading the docs:
The default SDK integrations will push and pop scopes intelligently. For instance web framework integrations will create and destroy scopes around your routes or controllers. https://docs.sentry.io/platforms/javascript/enriching-events/scopes/#whats-a-scope-whats-a-hub
(Although I disagree on the "intelligently" part, but that's a different discussion)
It is exactly a Node app. Sentry's Node SDK's push and pop is very naïve and does not take into account multiple threads. There is a single shared hub object with a single stack. Push/pop was the first thing I tried. It did not work. Sentry's support and dev teams are aware of the issue but have not issued an ETA on providing a fix. From my discussions with them, they seem to think that it would require a complete rewrite of their scope handlers but, after poring over their code, I disagree. The problem is challenging, but I documented a workaround and a suggested path forward with sample code above.
There is, unfortunately, no way in JS to make a "context" object available to objects down the call stack without actually passing them as arguments to all intervening functions, which makes this a tough problem to solve in JS. My workaround, which encodes a key into a function name on the call stack and then uses a bogus exception to retrieve it from the stack trace, would work, but I haven't had time to make that into a PR for the Sentry SDK yet. I was hoping one of their developers would be inspired by my test code which proves the concept will work, but that hasn't happened yet. I will keep trying to get time to solve this for them here but other pressures are keeping my focus on other projects at the moment.
@derekrprice I happened to stumble with this, I was having the same issues, I recommend checking the AsyncLocalStorage
API from node, I am pretty sure it will help you.
@richardsimko, check out #4071 for a detailed explanation of why you may not have experienced this issue yet with your web app. Basically, the problem is hidden in low traffic scenarios.
This issue has gone three weeks without activity. In another week, I will close it.
But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog
or Status: In Progress
, I will leave it alone ... forever!
"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀
Don't be stale.
Could this not also be solved the same way the http handler works? i.e. running the wrapped handler in a new domain?
With the 7.48.0
release of the JS SDK we use AsyncLocalStorage
instead of domains
under the hood, which should help with some of these problems.
If you want to manually wrap methods in an async context to isolate breadcrumbs/scope/spans, you can use Sentry.runWithAsyncContext
, which is documented here: https://docs.sentry.io/platforms/node/configuration/async-context/
Package + Version
@sentry/node
6.3.6Version:
Description
Here is my
withSentry
wrapper for Azure serverless Function invocations:Defining an Azure Function like this:
This logs everything correctly for exceptions since the scope is set immediately before sending the log, but performance tracing associates DB children with the wrong span and other context. This appears to be because the DB integrations use the span set on the scope. Since we have a high volume application (on the order of 3400 invocations per minute), many function invocations run concurrently and, since
Sentry.configureScope()
configures the span in a global scope, logged child DB transactions are associated not with the invocation they actually ran during but instead with context of the function invocation that last calledSentry.configureScope()
. In these images, there are dozens of MongoDB calls associated with the _sessions function even though the inserts are actually all called by the parent _timeseries function: