cc @bcoe
(Thanks for the report!)
As a side note - the best practice I am aware of is not to run in production with source-map-support (or --enable-source-maps) and instead upload the source maps to your error monitoring tooling. Examples with Sentry and Elastic (since there is an open-source offering): https://www.elastic.co/guide/en/apm/agent/rum-js/current/sourcemap.html https://docs.sentry.io/platforms/javascript/sourcemaps/
I'd be curious to know where the hotspot is in execution, to see if it can be improved...
But performance of the actual generation of stack traces hasn't so far been a concern, because this generally happens in the exceptional flow of an application and not on a hot path like in your example.
If I'm understanding your example code:
http.createServer((request, response) => {
  response.writeHead(200);
  console.log(new Error().stack);
  response.end();
}).listen(3000);
Every single request to the server you're benchmarking is generating a remapped stack trace?
Do you see a similar performance hit with large source maps in an example where no stack trace is created?
I know --enable-source-maps is experimental
Source maps are no longer experimental, it's just the case that this particular performance concern hasn't been analyzed or optimized (edit: if this is an actual use case we should fix it, I would much rather do that than push users away from the feature in our docs.)
Might be a good potential first contribution for someone to make (if you're perhaps even interested @cstickel). I would bet that a CPU profile of your example might point to an obvious step in the algorithm to improve.
(the fact that requests get slower and slower seems to point to a memory leak, which we definitely should fix).
I created the repo node-issue-41541 to test the hypothesis of a memory leak.
Across 40,000 requests with ab, there does not appear to be a memory leak:
ab -n 40000 -c 12 http://127.0.0.1:3000/
Requests per second: 106.13 [#/sec] (mean)
Time per request: 113.070 [ms] (mean)
Time per request: 9.423 [ms] (mean, across all concurrent requests)
Transfer rate: 7.77 [Kbytes/sec] received
Memory usage:
If I had to guess the main cause of the slowdown, it would be that the server is CPU bound (and the main culprit might be the readFileSync used to load the original source content):
I'm not sure if this is a feature that source-map-support has? Perhaps it doesn't need to read anything off disk like this.
I feel like performance-wise, for what's hopefully only coming up in exceptional cases, this throughput is probably pretty reasonable. But I would happily accept patches to improve CPU performance.
@cstickel if you are able to build Node from source, this branch tests the hypothesis that the call to readFileSync is the major performance difference.
I noticed ~100% better benchmarks with that branch:
With cache:
Percentage of the requests served within a certain time (ms)
50% 57
66% 59
75% 62
80% 63
90% 69
95% 80
98% 103
99% 106
100% 143 (longest request)
Without cache:
Percentage of the requests served within a certain time (ms)
50% 113
66% 117
75% 119
80% 122
90% 128
95% 137
98% 156
99% 162
100% 250 (longest request)
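The idea behind the branch is roughly the following (a minimal user-land sketch, not the actual patch; names are illustrative): memoize the original source text so that each remapped stack trace doesn't re-read the same file from disk.

const { readFileSync } = require('fs');

// Illustrative cache: original source text keyed by file path.
const originalSourceCache = new Map();

function getOriginalSource(path) {
  let content = originalSourceCache.get(path);
  if (content === undefined) {
    content = readFileSync(path, 'utf8'); // only hit the disk on a cache miss
    originalSourceCache.set(path, content);
  }
  return content;
}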
A bigger discussion to have: is it worth adding the complexity to optimize this path? If your application is throwing 1000s of stack traces, the performance degradation might not be your biggest problem.
@bcoe this optimization seems reasonable, though it will need to deal with the case where the files changed (maybe by stat-ing the files and using the last modification date as part of the cache, or by checking whether they haven't changed in the import graph / require.cache?)
The thing about these files is that they're likely to be files that were never in the import graph (.ts, .jsx), as they're the original files on disk that the source map maps back to, not the in-memory representation.
@cstickel your example is actually a worst-case scenario, because the source map maps back to a file that's 2Mb on disk, so with each stack trace this gets loaded to show the original source location. In the real-world example of a large combined set of source files with esbuild, this problem will be less pronounced because the source map will remap to the original files on disk (not the large combined file).
maybe by stat-ing the files and using the last modification date
I think this could work. I'd be curious to see if we get significantly better performance with a sync stat call than with reading the file (I'm betting yes, but might be worth sending a branch to @cstickel to test 😃).
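To make that concrete, the stat-based invalidation could extend the cache sketch above roughly like this (a hedged sketch, not the actual implementation):

const { readFileSync, statSync } = require('fs');

// Illustrative cache entry: { mtimeMs, content } keyed by file path.
const cache = new Map();

function readOriginalSourceCached(path) {
  const { mtimeMs } = statSync(path);   // cheaper than re-reading the whole file
  const hit = cache.get(path);
  if (hit !== undefined && hit.mtimeMs === mtimeMs) {
    return hit.content;                 // file unchanged since the last read
  }
  const content = readFileSync(path, 'utf8');
  cache.set(path, { mtimeMs, content });
  return content;
}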
The thing about these files, is that they're likely to be files that were never in the import graph (.ts, .jsx). As they're the original files on disk that the source map maps back to, not the in memory representation.
But they can still change, can't they? Wouldn't they be changing in a workflow where I develop .ts code, transpile it on the fly while developing, and use --enable-source-maps to get good errors?
@bcoe to be clear, while I am thankful for the bug report by the OP - I think the current behaviour is totally fine and you went above and beyond investigating.
In the past, error cases were considered exceptional, and the performance of doing new Error().stack in a tight loop was never measured with the intent to benchmark it, as this is not something people are expected to do in performance-sensitive code.
I think it might make sense to improve the docs regarding the expectations users should have with --enable-source-maps turned on.
(Caching is probably fine but probably not a priority?)
ah, sorry for the late response, was quite busy at the weekend.
@benjamingr applying the source maps in the monitoring tooling completely makes sense; I wasn't aware that some tools can do that. However, I'm afraid that a lot of teams just don't have the possibility to use the "best practice", maybe because of legacy stacks or other priorities. In general I think it's a good idea to provide a good, simple solution for everyone who is not able to use the best solution (like Advice, when advice is not heeded).
@bcoe the branch performs a lot better, however it's still far from how source-map-support is performing:
source-map-support
http.request_rate: ............................................................. 80/sec
http.requests: ................................................................. 4800
http.codes.200: ................................................................ 4800
http.responses: ................................................................ 4800
http.response_time:
min: ......................................................................... 0
max: ......................................................................... 51
median: ...................................................................... 2
p95: ......................................................................... 4
p99: ......................................................................... 6
native with caching
http.request_rate: ............................................................. 80/sec
http.requests: ................................................................. 4800
http.codes.200: ................................................................ 4800
http.responses: ................................................................ 4800
http.response_time:
min: ......................................................................... 5
max: ......................................................................... 132
median: ...................................................................... 27.9
p95: ......................................................................... 62.2
p99: ......................................................................... 82.3
In general I'm for sure always interested in contributing to open source projects. However, I'm not sure if I can handle it in a sane amount of time, because I have no clue about the project structure, but I will have a look in the next days to see if I can wrap my head around it. Just regarding the discussion about the source maps changing: in which cases is that important? At least when performance matters (like in production) they're not likely to change without a full restart. So it might also be possible to provide an additional parameter to control how aggressive the cache is, but that might make it way too complicated :-D Does source-map-support handle those cases properly?
@benjamingr and @bcoe regarding whether it's worth fixing/improving: the example was of course super constructed, because it should be a minimal reproduction of the problem. In contrast to the expectation of @bcoe, the problem is, as said, a lot more pronounced in real-world scenarios (at least when it's a single-file bundle; I haven't tested unbundled projects, but will deliver data for that tomorrow). If everything goes well and the application is perfectly written, of course it won't happen at all. However, as we know, software isn't perfectly written, like all software isn't. But it can certainly happen in real scenarios, and it did in our current project (luckily in a pre-production load test). In our case we had not yet refined generic logging that still logged stack traces in too many cases. We just didn't expect it to do harm, especially because in other projects we didn't have the problem while using source-map-support. In that project, really just a few (5-10) simultaneous requests that accessed the stack trace took so long that the Kubernetes health check failed. But in general, let's just imagine a single uncaught exception that may occur temporarily for all requests (maybe because some external resource is not available); it might have a domino effect. And stuff like that has happened before for a lot of companies (of course related to other problems). Just saying, there imho is quite a huge risk.
Of course that's only the case if you don't say that using --enable-source-maps or bundling to a single file is a no-go in production anyway. But then the docs should be super clear about that. Like super clear, as in "Don't use this flag in production." instead of "It might have a performance impact.".
In any case, it just seems to be a regression compared to source-map-support, or is there any reason why the native solution performs that much worse? Are there additional features or bug fixes that require a lot more processing?
However, I also don't see it as a top priority. For me personally I'll try to get the source maps to the monitoring as suggested by @benjamingr. It really just sounds like a good practice, I just wasn't aware of it. In cases where this isn't possible I'm going to still use source-map-support as long as it's maintained, because it's working perfectly fine. It's an additional dependency, but... that's just how it is. Also, not bundling to a single file would be an option for me, if that improves the performance, but actually I do like the small build artifacts that can be achieved with bundling (no overhead from node_modules). I'm just worried that source-map-support won't be maintained much longer, now that native source map support is there.
So I would be happy if the issue is deprioritized, but not closed, because I personally do see a good reason why the performance should be improved in the long term :-D
We just didn't expect it to do harm, especially because in other projects we didn't have the problem while using source-map-support. In that project, really just a few (5-10) simultaneous requests that accessed the stack trace took so long that the Kubernetes health check failed.
@cstickel If you could figure out a minimal reproduction that demonstrates behavior closer to this, it would help me decide if it's worth implementing caching (given the complexity it introduces).
I'm not 100% sure why the performance is significantly worse than source-map-support, but 30ms responses with 80 concurrent requests is reasonable performance IMO. If, however, we can recreate 5-10 simultaneous requests causing seconds of delay (what are your health checks configured for?), that seems like a strong argument for caching the reading of the original source off disk.
If we can cause the slowdown you're describing, perhaps we can find some other optimizations while we're at it.
Thanks for putting so much effort into the topic! I'll create a repo with a minimal reproduction that is closer to a realistic scenario tomorrow morning, and I'm also going to start going through the code and comparing it with the source-map-support implementation to get an idea which parts could cause the difference.
@bcoe sorry, I didn't find time yesterday, but there it is: https://github.com/cstickel/node-js-source-maps-performance-test I tried to show a scenario that might really happen, but of course it can always be argued that developers should not make such mistakes. What should I say, something quite similar happened to us 🙃 Developers just make mistakes, and that one does not even look obviously harmful, just useless, imho.
For me the effect is heavily visible in that example at 20 req/sec. However, in a real application it was even a bit worse. I think the main cause is more dependencies => bigger bundle size than in the stripped-down example, and disk I/O might also be worse in Kubernetes.
@cstickel thank you for the reproduction. I'm unlikely to have time to dig again until the weekend. When I have a chance to, I will try the recommendation of using stat to bust the cache, and see if we still get a significant performance boost.
I use NestJS and found this strange behavior: with webpack hot reload my "empty" application restarts in 4 seconds with the --enable-source-maps flag enabled, while switching to the source-map-support plugin brings that down to 1 second. I measured this data more precisely by building the bundle of my application together with inline source maps (plain source map files are no better), and this is what happened:
time node --require=source-map-support/register dist/server/server.dev.js
real 0m1.746s
user 0m1.408s
sys 0m0.085s
time node --enable-source-maps dist/server/server.dev.js
real 0m5.172s
user 0m4.591s
sys 0m0.366s
This makes me sad, because I can't manage file paths in the source-map-support plugin, but it works very fast. And the file paths in native Node.js are fine, but waiting 4 seconds after each edit to restart my server is a pain.
I understand that this is not really about the performance of the application itself, but it may still be relevant here.
@WeslyG do you feel comfortable building Node.js from source? perhaps you could test against this branch:
https://github.com/bcoe/node-1/commit/21489fd36dd7929c95f268487df12170c5711148
If this is biting people in the real world, let's fix it.
In terms of the real world, we've just upgraded from Node 12 to 16 and have seen our start times leap up from approx 15s to at worst over 90s (seems to always be 60s+).
Checking with --perf led me to source map parsing being the issue and to this issue.
I'm not sure if I can add much more to the conversation than that however.
@ChrisWestcottUK are you running with --enable-source-maps?
Yes, it drops back down to about 10-15s without --enable-source-maps and up to 60s-90s with --enable-source-maps.
We've jumped up a few versions in one go, so I wonder if it is a regression in this area.
I've just tested switching back to Node 14 LTS with the same problem. And then back to 12 again which has no such delays.
I'm also running into this. Switching my webpack devtool to "nosources-cheap-source-map" helped a little bit, since we just need line numbers and not source content, but the hit is still pretty big. The source map isn't unreasonably large, about 300kb.
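For reference, that webpack tweak is a one-line config change (a sketch; the rest of the config is assumed to already exist):

// webpack.config.js (sketch) - keep line mappings but drop the embedded
// source contents, which shrinks the map that Node has to load and parse.
module.exports = {
  // ...the rest of your existing configuration
  devtool: 'nosources-cheap-source-map',
};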
@ChrisWestcottUK @laverdet if either of you feel comfortable building against a branch, try:
https://github.com/bcoe/node-1/commit/21489fd36dd7929c95f268487df12170c5711148
The major difference between 12 and 14 is that the original source file is loaded from disk, so that stack traces can be shown in context.
What's odd to me is that it should be the original source being read from disk, not the bundled source, so I'm a little surprised that there's such a performance hit -- this is why I'd like confirmation that the fix actually helps, in a real-world environment, before adding the caching behavior.
Note: if you can't build from source, alternatively you can point me against a test case to run against, and I can profile.
Thanks. I didn't notice any difference in runtime when running with 21489fd36dd7929c95f268487df12170c5711148.
Our applications starts up in about 5 second with no flags, or 30 seconds with --enable-source-maps.
@laverdet or @ChrisWestcottUK do either of you have a reproduction I could work from for profiling?
I took some time this morning to look into the issue further and I think I know what's going on here.
You can fix the issue by removing these debug invocations:
diff --git a/lib/internal/source_map/prepare_stack_trace.js b/lib/internal/source_map/prepare_stack_trace.js
index 9502cfef6f..8f1438d243 100644
--- a/lib/internal/source_map/prepare_stack_trace.js
+++ b/lib/internal/source_map/prepare_stack_trace.js
@@ -106,7 +106,6 @@ const prepareStackTrace = (globalThis, error, trace) => {
}
}
} catch (err) {
- debug(err.stack);
}
return `${str}${t}`;
}), '');
diff --git a/lib/internal/source_map/source_map_cache.js b/lib/internal/source_map/source_map_cache.js
index c0de6aeb51..a2bb0ee9ef 100644
--- a/lib/internal/source_map/source_map_cache.js
+++ b/lib/internal/source_map/source_map_cache.js
@@ -79,7 +79,6 @@ function maybeCacheSourceMap(filename, content, cjsModuleInstance) {
} catch (err) {
// This is most likely an [eval]-wrapper, which is currently not
// supported.
- debug(err.stack);
return;
}
const match = StringPrototypeMatch(
@@ -119,7 +118,6 @@ function dataFromUrl(sourceURL, sourceMappingURL) {
return null;
}
} catch (err) {
- debug(err.stack);
// If no scheme is present, we assume we are dealing with a file path.
const mapURL = new URL(sourceMappingURL, sourceURL).href;
return sourceMapFromFile(mapURL);
@@ -144,7 +142,6 @@ function sourceMapFromFile(mapURL) {
const data = JSONParse(content);
return sourcesToAbsolute(mapURL, data);
} catch (err) {
- debug(err.stack);
return null;
}
}
@@ -163,7 +160,6 @@ function sourceMapFromDataUrl(sourceURL, url) {
const parsedData = JSONParse(decodedData);
return sourcesToAbsolute(sourceURL, parsedData);
} catch (err) {
- debug(err.stack);
return null;
}
} else {
The call in dataFromUrl is particularly hot. Removing that one debug clears up most of the runtime slowdown for me.
If these logs are important, consider some check before accessing the err.stack property. The problem is that you're eagerly generating formatted stack traces when they're not being used. This is actually pretty expensive even without an Error.prepareStackTrace handler installed.
Consider:
const d = performance.now();
try {
  throw new Error("wow");
} catch (err) {
}
console.log(performance.now() - d);

const d2 = performance.now();
try {
  throw new Error("wow");
} catch (err) {
  err.stack;
}
console.log(performance.now() - d2);
Logs: 0.03008270263671875 0.15012502670288086
This is actually a problem elsewhere in the nodejs codebase: https://github.com/nodejs/node/blob/6847fec38433a1dd16d9e3d0915e2b7fa32692c1/lib/internal/modules/esm/resolve.js#L1183 https://github.com/nodejs/node/blob/6847fec38433a1dd16d9e3d0915e2b7fa32692c1/lib/internal/modules/esm/module_job.js#L138
The best practice in these cases, as far as I know, is to install a lazy stack getter on the thrown Error object which handles these expensive computations only when the user actually accesses the property. Generating this information any time the exception is allocated causes the kinds of slowdowns we're seeing here. I have a feeling that cleaning up these .stack accesses will actually improve runtime speeds for users not using --enable-source-maps, since these "exceptional" cases are actually fairly common in many types of applications.
Edit: I used the lazy stack technique mentioned above in my [deprecated] node-fibers package: https://github.com/laverdet/node-fibers/blob/da2ae29ddbbabdc0d54f7abd7416fbee9826ee65/future.js#L211-L250
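A minimal sketch of that lazy-getter pattern, for readers who don't want to dig through the node-fibers source (illustrative only: decorateErrorLazily and makeSuffix are made-up names, and this is not the Node.js internals code):

function decorateErrorLazily(err, makeSuffix) {
  // Remember whatever V8 installed as the error's own `stack` property.
  const original = Object.getOwnPropertyDescriptor(err, 'stack');
  Object.defineProperty(err, 'stack', {
    configurable: true,
    get() {
      // The decorated stack is only built on first access, then cached as a
      // plain data property so later reads are cheap.
      const base = original.get ? original.get.call(err) : original.value;
      const value = `${base}\n${makeSuffix()}`;
      Object.defineProperty(err, 'stack', { value, writable: true, configurable: true });
      return value;
    },
  });
  return err;
}

// Usage sketch: makeSuffix (however expensive) only runs if something reads
// .stack, e.g. when the error is finally logged.
const err = decorateErrorLazily(new Error('boom'), () => '    at <extra async context>');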
@laverdet if we switched to err.message, that should also give a significant performance improvement, correct? And it makes it a bit easier to debug when working in the codebase.
_Edit: also, should we open separate issues for const lines = StringPrototypeSplit(error.stack, '\n');?_
err.message would be a lot better.
Feel free to open a new issue for the calls to err.stack. I think putting them behind a lazy getter would be valuable, but I don't have the bandwidth to personally do it.
Looks like you could also use the callback argument of debuglog if this err.stack information is important for troubleshooting:
debuglog('source_map', debug => debug(err.stack))
If debugging is not enabled, the callback will not be invoked.
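Another option along the same lines (a hedged sketch, assuming the debuglog().enabled getter available in current Node releases; parseSourceMapUrl is an illustrative name, not the internals code): guard the expensive .stack access behind the logger's enabled flag.

const { debuglog } = require('util');
const debug = debuglog('source_map');

function parseSourceMapUrl(url) {
  try {
    return new URL(url).href;
  } catch (err) {
    if (debug.enabled) {
      // err.stack (and its source-map remapping) is only computed when
      // NODE_DEBUG=source_map is actually set.
      debug(err.stack);
    }
    return null;
  }
}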
@laverdet give #42096 a shot, if you have the cycles 😄, thanks for helping to dig into this.
c1275f8dcceedcba4320e4b8c5424ba7586437e7 provides a noticeable improvement to runtime w/ source maps enabled, in my case.
@laverdet the main branch of Node should hopefully now address the performance bottleneck you found 🎉
@ChrisWestcottUK I'm hopeful this will address your issue as well.
@cstickel I'm not expecting this addresses the use case you brought up, as the underlying problem with that one I believe is the re-reading of a large file when stack traces are created.
Hey there, I happen to have encountered a very similar issue (here's my story: https://twitter.com/vvoyer/status/1498436054851981320).
While I was not able to read through the complete issue comments, I can provide you with a CPU trace that shows the highest consumers in my --enable-source-maps issue (decodeVLQ, parseMappingPayload, parseMap, peek, ...).
You can contact me via DM on Twitter (https://twitter.com/vvoyer), as I am not sure of the security implications of sharing that profile.
I believe this issue is mostly seen by folks bundling code for use in AWS Lambda, for example. And it's very popular (from a few Google searches) to use --enable-source-maps with AWS Lambda.
We're using Node.js 14.17.6, can't test with a higher version though (AWS lambda limitation).
Thanks!
~FWIW We're using Datadog tracing and Sentry, which could also be responsible for creating this mess.~
Update: We have a const { stack } = new Error() in our code, right before doing some HTTP request. And parsing the source map takes more time than the actual HTTP request. We were using inline source maps.
@vvo if I'm understanding correctly, a hot path in the lambda has this logic?
const { stack } = new Error()
☝️ accessing the stack runs prepareStackTrace, which will apply source maps if you have --enable-source-maps enabled.
I think it would be worth adding to the documentation that .stack can be an expensive operation, but it's not possible to have source map support and not have the additional CPU burden (especially if your source map is quite large).
For the more common use case of accessing .stack in an exceptional code path, you will take the hit, but get better error output -- it's a trade-off.
@vvo one question for you, for the logic:
const { stack } = new Error()
☝️ if the stack is going to be the same on every request, couldn't you do the caching in user-land:
let cachedStack;
function main(req, res) {
  if (!cachedStack) {
    const { stack } = new Error();
    cachedStack = stack;
  }
}
I think we could get some performance improvements by caching the loading of original source from disk, which is a bottleneck we haven't taken care of yet. But I don't think there are significant performance gains we could get in decodeVLQ, if the problem is that the lambda is CPU bound (which it looked like from your thread).
Hey @bcoe, as it turned out, we did not even need the stack trace to be "prepared" (const { stack } = new Error()).
So all we had to do was switch to const error = new Error(), and only access the stack if there's an error.
I think most Node.js developers do not realize that accessing error.stack can potentially do a lot of work.
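A sketch of that change, for readers skimming the thread (requestWithContext and doHttpRequest are illustrative names, not from the actual codebase):

async function requestWithContext(doHttpRequest) {
  const marker = new Error('request context'); // cheap: the stack string is not formatted here
  try {
    return await doHttpRequest();
  } catch (cause) {
    // Only the failure path pays for .stack formatting
    // (and source-map remapping under --enable-source-maps).
    console.error(marker.stack, cause);
    throw cause;
  }
}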
Hi All, I have experienced the same problems being discussed in a production setting. It's exciting to see effort towards improving source map performance. I put together a very simple demo repository that does a timing test for various combinations of node runtime, compiler, and native / 3rd party sourcemap support.
It's not the most scientific testing method - but I think it's good enough to get some sense of the performance impact you might experience in the wild.
@mjpowersjr I'd expect performance to be a bit slower, due to us loading the original source from disk; my also-unscientific opinion is that the tsc performance looks reasonable.
I'm curious about the vastly different performance characteristics of esbuild, though; this warrants digging into (it seems like a bug).
Could the esbuild performance be because esbuild creates a single bundle while tsc doesn't? So when parsing source maps, it's a lot faster to parse a small source map than a gigantic one?
Refs: conversation in https://github.com/nodejs/node/pull/43428
@legendecas I believe we solved one of the major sources of slowdown, which was that the .stack property was being accessed when printing debug messages.
However, in some of my profiling, I noticed that readFileSync was a significant bottleneck as well.
A user in https://github.com/nodejs/node/issues/43186 suggests dropping this prepended context, as it would make error messages more compliant with an emergent specification (proposal-error-stacks) -- I feel it might be worth dropping the logic for prepending context:
However, should this be considered a breaking change, as some people might have written code that assumes the context is prepended?
However, should this be considered a breaking change, as some people might have written code that assumes the context is prepended?
It is a breaking change. However, as the source context is not prepended to the value of stack when source maps are not enabled, as mentioned in the OP of https://github.com/nodejs/node/issues/43186, I wonder if there is anyone who would rely on it. I believe we can land such a change as a semver-major one.
Ideally, the behavior should be to only print the source context when the exception is caught by the runtime and there is no proper handler to process it, just like when source maps are not enabled.
I'll work on this to align the behavior whether or not source maps are enabled, to improve the performance for people who just want to get the value of the stack property.
Found this issue and some tweets by @vvo after having this exact same problem on AWS Lambda.
If you're using esbuild and you only care about stack traces of your own code, you can bypass the generation of sourcemaps for everything inside node_modules with a simple trick like this one: https://github.com/evanw/esbuild/issues/1685#issuecomment-944916409. Mine went from a huge 24M to a mere 240K.
If you're not using source maps to debug the bundled code, another good alternative is to turn off the sourcesContent option.
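For reference, turning off sourcesContent is a one-line option in esbuild's build API (a sketch; the entry point and output paths are placeholders):

// build.js (sketch)
require('esbuild').build({
  entryPoints: ['src/index.ts'],
  bundle: true,
  platform: 'node',
  outfile: 'dist/index.js',
  sourcemap: true,
  sourcesContent: false, // keep mappings, drop the embedded original sources
}).catch(() => process.exit(1));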
Version
v16.13.2
Platform
5.13.0-22-generic #22-Ubuntu x86_64 x86_64 x86_64 GNU/Linux
Subsystem
No response
What steps will reproduce the bug?
I know --enable-source-maps is experimental and it's maybe an edge case, because the JS files are only large enough to show the effect if they are bundled, which might in most cases not be the case. However, hopefully the bug report still helps to improve the feature.
1. tsc --sourceMap index.ts
2. Run the file with --enable-source-maps, like node --enable-source-maps index.js
=> The first requests will be quite fast, but then they get slower and slower.
3. Compare against node index.js and node -r source-map-support/register index.js
=> Considering source maps obviously has a performance impact, but it's negligible with the source-map-support library, even in production for most cases (even if there are a lot of accesses to error stacks).
How often does it reproduce? Is there a required condition?
Always, if the JavaScript file is large enough and error stacks are accessed frequently enough.
What is the expected behavior?
Similar performance to source-map-support.
What do you see instead?
A heavy performance impact for using native source map support.
Additional information
--enable-source-maps is experimental and not meant to be used in production environments, but I still think this needs to be fixed if it should ever become non-experimental.
Some data from artillery, just for everyone who is not reproducing it on their own (of course these stats heavily depend on the machine they are executed on, but the tendency should be the same everywhere):
With node --enable-source-maps index.js (native source maps)
=> as it already stacks up load for more than 60 sec, more requests per second would lead to timeouts
With node index.js (no source maps): 10 times more requests
With node -r source-map-support/register index.js (source-map-support): 10 times more requests