ljharb / json-stable-stringify

MIT License
55 stars 11 forks source link

cpu/memory performance regression with 1.1.0 #9

Closed mdouglass closed 10 months ago

mdouglass commented 10 months ago

We upgraded our project which deals with large JSON files (~250MiB) to 1.1.0. Using 1.1.0 we are seeing two performance-related regressions depending on whether we have a max-old-space-size setting in place.

  1. Without max-old-space-size, time to run json-stable-stringify increases from ~4s to ~10s.
  2. With max-old-space-size, the process crashes with an OOM.

node.js v20.7.0 json-stable-stringify 1.0.2 vs 1.1.0

index.mjs

import jss from "json-stable-stringify";
import { readFile, writeFile } from "fs/promises";

async function main() {
  const raw = await readFile("./input.json", "utf8");
  const parsed = JSON.parse(raw);
  const startMS = Date.now();
  const pretty = jss(parsed, { space: 2 });
  const endMS = Date.now();
  await writeFile("output.json", pretty, "utf8");
  console.log(endMS - startMS, "ms");
}

main().catch((e) => console.error(e.stack));

test script

#!/bin/bash

npm install json-stable-stringify@1.0.2
node --max-old-space-size=2048 ./index.mjs
node --max-old-space-size=2048 ./index.mjs
node --max-old-space-size=2048 ./index.mjs
node ./index.mjs
node ./index.mjs
node ./index.mjs

npm install json-stable-stringify@1.1.0
node --max-old-space-size=2048 ./index.mjs
node --max-old-space-size=2048 ./index.mjs
node --max-old-space-size=2048 ./index.mjs
node ./index.mjs
node ./index.mjs
node ./index.mjs

output of running test.sh

✖ ./test.sh

removed 12 packages, changed 1 package, and audited 3 packages in 568ms

2 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities
4165 ms
4463 ms
4325 ms
4353 ms
4289 ms
4093 ms

added 12 packages, changed 1 package, and audited 15 packages in 847ms

9 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

<--- Last few GCs --->

[450840:0x6c88a90]    17145 ms: Scavenge 2030.7 (2061.9) -> 2028.5 (2062.7) MB, 2.76 / 0.00 ms  (average mu = 0.163, current mu = 0.123) allocation failure; 
[450840:0x6c88a90]    17151 ms: Scavenge 2031.2 (2062.7) -> 2029.3 (2063.4) MB, 3.76 / 0.00 ms  (average mu = 0.163, current mu = 0.123) allocation failure; 
[450840:0x6c88a90]    17156 ms: Scavenge 2032.2 (2063.4) -> 2030.2 (2072.2) MB, 4.04 / 0.00 ms  (average mu = 0.163, current mu = 0.123) allocation failure; 

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xc9b0c0 node::Abort() [node]
 2: 0xb700c9  [node]
 3: 0xec0260 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 4: 0xec0547 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 5: 0x10d1bd5  [node]
 6: 0x10d2164 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [node]
 7: 0x10e9054 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*) [node]
 8: 0x10e986c v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 9: 0x10eb9ca v8::internal::Heap::HandleGCRequest() [node]
10: 0x1057117 v8::internal::StackGuard::HandleInterrupts() [node]
11: 0x1212fb4 v8::internal::JsonStringifier::Result v8::internal::JsonStringifier::Serialize_<false>(v8::internal::Handle<v8::internal::Object>, bool, v8::internal::Handle<v8::internal::Object>) [node]
12: 0x1214ee6 v8::internal::JsonStringify(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>) [node]
13: 0xf522d2 v8::internal::Builtin_JsonStringify(int, unsigned long*, v8::internal::Isolate*) [node]
14: 0x1932df6  [node]
./test.sh: line 14: 450840 Aborted                 (core dumped) node --max-old-space-size=2048 ./index.mjs

<--- Last few GCs --->

[451319:0x6c4ba90]    17372 ms: Scavenge 2031.8 (2062.9) -> 2029.6 (2063.9) MB, 3.12 / 0.00 ms  (average mu = 0.158, current mu = 0.122) allocation failure; 
[451319:0x6c4ba90]    17377 ms: Scavenge 2032.3 (2063.9) -> 2030.4 (2064.9) MB, 3.66 / 0.00 ms  (average mu = 0.158, current mu = 0.122) allocation failure; 
[451319:0x6c4ba90]    17382 ms: Scavenge 2033.4 (2064.9) -> 2031.4 (2073.9) MB, 3.09 / 0.00 ms  (average mu = 0.158, current mu = 0.122) allocation failure; 

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xc9b0c0 node::Abort() [node]
 2: 0xb700c9  [node]
 3: 0xec0260 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 4: 0xec0547 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 5: 0x10d1bd5  [node]
 6: 0x10d2164 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [node]
 7: 0x10e9054 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*) [node]
 8: 0x10e986c v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 9: 0x10eb9ca v8::internal::Heap::HandleGCRequest() [node]
10: 0x1057117 v8::internal::StackGuard::HandleInterrupts() [node]
11: 0x1212fb4 v8::internal::JsonStringifier::Result v8::internal::JsonStringifier::Serialize_<false>(v8::internal::Handle<v8::internal::Object>, bool, v8::internal::Handle<v8::internal::Object>) [node]
12: 0x1214ee6 v8::internal::JsonStringify(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>) [node]
13: 0xf522d2 v8::internal::Builtin_JsonStringify(int, unsigned long*, v8::internal::Isolate*) [node]
14: 0x1932df6  [node]
./test.sh: line 15: 451319 Aborted                 (core dumped) node --max-old-space-size=2048 ./index.mjs

<--- Last few GCs --->

[452207:0x7462a90]    16704 ms: Scavenge 2031.7 (2062.7) -> 2029.5 (2063.9) MB, 3.23 / 0.00 ms  (average mu = 0.163, current mu = 0.124) allocation failure; 
[452207:0x7462a90]    16710 ms: Scavenge 2032.1 (2063.9) -> 2030.2 (2064.7) MB, 4.52 / 0.00 ms  (average mu = 0.163, current mu = 0.124) allocation failure; 
[452207:0x7462a90]    16716 ms: Scavenge 2033.2 (2064.7) -> 2031.2 (2073.7) MB, 4.03 / 0.00 ms  (average mu = 0.163, current mu = 0.124) allocation failure; 

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xc9b0c0 node::Abort() [node]
 2: 0xb700c9  [node]
 3: 0xec0260 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 4: 0xec0547 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [node]
 5: 0x10d1bd5  [node]
 6: 0x10d2164 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [node]
 7: 0x10e9054 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*) [node]
 8: 0x10e986c v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 9: 0x10eb9ca v8::internal::Heap::HandleGCRequest() [node]
10: 0x1057117 v8::internal::StackGuard::HandleInterrupts() [node]
11: 0x14f86aa v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) [node]
12: 0x7f733fed9ef6 
./test.sh: line 16: 452207 Aborted                 (core dumped) node --max-old-space-size=2048 ./index.mjs
9991 ms
10502 ms
11141 ms

The JSON input file I used for this run is 267,437,008 bytes. I am not sure sure if you'll need it or if any sample file in this size range would be able to demonstrate the problem.

mdouglass commented 10 months ago

I was able to run git bisect between 1.0.2 and HEAD of main using a modified version of my test.sh. The first failed commit is this one:

6c066b82708eb7e7ca0ca7f89737df48aa534a6c is the first bad commit
commit 6c066b82708eb7e7ca0ca7f89737df48aa534a6c
Author: Jordan Harband <ljharb@gmail.com>
Date:   Fri Nov 10 22:10:43 2023 -0800

    [Refactor] build up a string instead of an array + join

 index.js | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
bisect found first bad commit

At a first glance that makes sense to me -- the creation of lots of temporary strings as it builds up the final JSON output vs the single-pass. For our use case, I'd lean towards reverting that, but I'm not sure what the motivator for the change was.

ljharb commented 10 months ago

Interesting, that makes sense but I wouldn't have thought of it.

I assume you can confirm that v1.1.0, with that commit reverted, fixes the problem? If so, I'll put up a PR (that's not just a straight revert), and you can try that one before I merge and release it.

mdouglass commented 10 months ago

Yep, that seems to work.

Version w/o --max-old-space-size --max-old-space-size=2048
1.0.2 4237, 4419, 4328 ms 4363, 4429, 4414 ms
1.1.0 10456, 10685, 11513 ms OOM, OOM, OOM
1.1.0 + revert 3809, 3927, 3769 ms 3884, 3752, 3885 ms
ljharb commented 10 months ago

awesome, even faster than v1.0.2 :-) I'll get a fix for this out tonight.

mdouglass commented 10 months ago

ty!

ljharb commented 10 months ago

@mdouglass if you're able to test out #10, just to make absolutely sure it'll solve the problem, then I can merge and release it :-)

mdouglass commented 10 months ago

lgtm, thank you again :+1:

Version w/o --max-old-space-size --max-old-space-size=2048
1.0.2 4237, 4419, 4328 ms 4363, 4429, 4414 ms
1.1.0 10456, 10685, 11513 ms OOM, OOM, OOM
1.1.0 + revert 3809, 3927, 3769 ms 3884, 3752, 3885 ms
#10 3851, 3934, 3813 ms 3884, 3785, 3686