nodejs / help

:sparkles: Need help with Node.js? File an Issue here. :rocket:

Strings subsystem generates (hard-to-detect) memory leaks. Garbage collector update request #4437

Closed Informate closed 2 months ago

Informate commented 3 months ago

Version

v22.4.0

Platform

Linux 6.5.0-41-generic

Subsystem

Strings, Garbage Collector

What steps will reproduce the bug?

At OoM, in some examples only 1%-2% of the heap is really in use, and a waste of 5%-10% may be usual. In the at-the-edge example below, a memory waste of 99.9984% is achieved.

$ node --max-old-space-size=6 urls-7.js
// urls-7.js

function getRandomBuffer(){
 const bufferSize = 1024 * 1024; // 1 MB
 const randomBuffer = Buffer.alloc(bufferSize);
 for (let i = 0; i < bufferSize; i++) {
  randomBuffer[i] = Math.floor(Math.random() * 256);
 }
 return randomBuffer;
}

let slices = [];
// while(true) {                             // Unclear could create miss-understanding
for (let i = 0; i < 500000; i++) {    // Clearer than while(true) - 0.5 TB of buffers in total
  let string = getRandomBuffer().toString('utf-8');
  const slice = string.slice(50000, 50016);
  slices.push(slice);
  __heaplog();
}

// Helper function to show memory leakages
function __heaplog(){
  let m=process.memoryUsage();
  console.log('\nHeap: '+((m.heapUsed/2**20)|0)+'/'+(m.heapTotal/2**20|0)+' Mb - RSS: '+(m.rss/2**20|0)+' Mb');
}

How often does it reproduce? Is there a required condition?

At OoM.

<--- Last few GCs --->

[5918:0x70ef000]     1805 ms: Mark-Compact 6.0 (8.1) -> 5.1 (9.9) MB, pooled: 0 MB, 3.53 / 0.00 ms  (average mu = 0.980, current mu = 0.994) task; scavenge might not succeed
[5918:0x70ef000]     1965 ms: Mark-Compact 5.9 (10.2) -> 5.4 (10.2) MB, pooled: 0 MB, 10.56 / 0.00 ms  (average mu = 0.968, current mu = 0.934) background allocation failure; GC in old space requested

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

 1: 0xe21092 node::OOMErrorHandler(char const*, v8::OOMDetails const&) [nodejs]
 2: 0x12224f0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [nodejs]
 3: 0x12227c7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [nodejs]
 4: 0x1452305  [nodejs]
 5: 0x146bb79 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [nodejs]
 6: 0x13bca68 v8::internal::StackGuard::HandleInterrupts(v8::internal::StackGuard::InterruptLevel) [nodejs]
 7: 0x187ab97 v8::internal::Runtime_StackGuardWithGap(int, unsigned long*, v8::internal::Isolate*) [nodejs]
 8: 0x1f31576  [nodejs]

What is the expected behavior? Why is that the expected behavior?

Recover from the error, or handle it before it occurs. Java and JavaScript are memory-managed languages in which the programmer should not need to keep track of memory. In this situation, however, the programmer does need to keep track of memory and handle it manually.

What do you see instead?

Keeping a string produced by RegExp operations generates a memory leak. The RegExp extracts short parts from an HTTP document retrieved over HTTPS. The bug was already present some years ago (and, I suppose, was never fixed). Cleaning the string resolves the memory leak (though this leaves the JSON-related leaks):

clean_string = JSON.parse(JSON.stringify( memory_leaking_string ));

or better (without the other leaks):

clean_string = Buffer.from(memory_leaking_string, 'utf-8').toString('utf-8');

The problem is not solved by other simple operations such as .slice(0) or .toString().
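For illustration, the Buffer round-trip above can be wrapped in a small helper. This is a hedged sketch: the name `scrubString` is mine, not a Node.js API, and whether the copy is needed depends on how the engine represents the slice.

```javascript
// Hypothetical helper (not a Node.js API): round-trip a retained slice
// through a Buffer so it becomes an independent copy, allowing the large
// parent string to be garbage-collected.
function scrubString(str) {
  return Buffer.from(str, 'utf-8').toString('utf-8');
}

const parent = 'x'.repeat(1024 * 1024);      // 1 MB parent string
const leakySlice = parent.slice(0, 16);      // a sliced string may pin `parent`
const cleanSlice = scrubString(leakySlice);  // independent 16-char copy
console.log(cleanSlice === leakySlice);      // true (same contents)
```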

Additional information

The Strings subsystem wastes memory, and the garbage collector is not able to recover before the heap crash.

Solving the problem and letting the garbage collector recover could require memory maps.

Random-sampling operations could fit into the standard GC schedule, with the frequency increased on a match. Random sampling with just one sample at a time would introduce very limited overhead on the GC. The sampling test would require a full GC cycle on the old space. Exploiting a fractal partitioning algorithm (1st: the middle of the space, 2nd: the middle of the first half, 3rd: the middle of the second half, 4th: the middle of the first quarter, ...) could reduce the random-sampling overhead. When the sampling test fails, successive iterations up to the completion of the GC cycle could be used to propose a new candidate (continuing the partial test on the next candidate). With a 10% memory waste, single-sample random sampling should match in approximately 11 full GC cycles.

Running the new GC task in a separate low-priority GC thread on the old heap space would require a limited amount of resources. For example, using 32 samples from random sampling or fractal partitioning with free-space bounds checks would require [32 × sizeof(memory_address) × 3 bytes = 768] less than 1 KB, plus a stack as deep as the string hierarchy tree multiplied by a constant that could be as small as 2, 3, or even 1 (inlining the next strings to check into an array).
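For illustration only (this is my sketch, not V8 code), the fractal partitioning order described above visits the middle of the space, then the middles of the halves, then of the quarters, and so on, like a low-discrepancy probe sequence:

```javascript
// Illustrative sketch of the proposed fractal partitioning order:
// yields probe offsets at size*1/2, then size*1/4 and size*3/4,
// then the eighths, and so on, over a space of `size` slots.
function* fractalProbeOrder(size) {
  for (let denom = 2; denom <= size; denom *= 2) {
    for (let num = 1; num < denom; num += 2) {
      yield Math.floor((size * num) / denom);
    }
  }
}

// First probes over a 16-slot space:
console.log([...fractalProbeOrder(16)].slice(0, 7)); // [ 8, 4, 12, 2, 6, 10, 14 ]
```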

RedYetiDev commented 3 months ago

Hi! Can you provide a minimally reproducible example?

Informate commented 3 months ago

My code is rather long and complex, but the solution baffles me. This is the second time I have had this problem (the first time on a similar task some years ago, solved the same way). The really strange thing is that the workaround actually solves the problem! (Otherwise I would think it was something else.)

RedYetiDev commented 3 months ago

My code is rather long and complex, but the solution baffles me.

Can you find a minimal reproduction in that? It'd really help demonstrate the issue.

Informate commented 3 months ago

It comes and goes... It has to be related to the HTTP content given to the RegExp; I have no other explanation. Anyway, the problem has been present for years.

RedYetiDev commented 3 months ago

If you've resolved the issue, feel free to self-close.

Informate commented 3 months ago

No! I did not solve the issue... The problem has been present for years; this is the 2nd time I have found it on the same task: a string from a RegExp over text from an HTTP request.

The string generates a memory leak because it carries with it shadowed data kept alive by the RegExp module.

RedYetiDev commented 3 months ago

Okay, a minimal reproducible example is still important to help diagnose the issue.

climba03003 commented 3 months ago

I believe this is related to V8 rather than Node.js, and it needs to be reported to V8 in order to be solved.

String slicing often keeps a reference to the parent string instead of copying into a new buffer, and RegExp results are likely string slices.

Reference https://iliazeus.github.io/articles/js-string-optimizations-en/#sliced

Informate commented 3 months ago

Sorry, I do not know where to report it. In the past I found and reported a bug in the PCRE RegExp engine under PHP, so it could also be the RegExp engine. I reported it here because I do not know what would be best, and because this is the second time I have hit the same problem/bug.

Informate commented 3 months ago

Reference https://iliazeus.github.io/articles/js-string-optimizations-en/#sliced

Yes, it is exactly what this post is talking about (chapter: "Unintended side effects, or: how to scrub a string"), but it ends up crashing the Node.js process instead of freeing the memory and reorganizing the strings in the heap.

RedYetiDev commented 3 months ago

@nodejs/v8 Is this a known limitation?

Informate commented 3 months ago

Maybe they could provide an official scrubString method, better than round-tripping through JSON?

climba03003 commented 3 months ago

It is unlikely to be fixed; you can track the problem at the link below. If you are using String.prototype.match, you should use RegExp.prototype.exec instead to prevent another memory leak.

Reference https://issues.chromium.org/issues/41480525
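As a hedged sketch of that suggestion (whether it avoids the sliced-string retention depends on the engine, so treat it as a workaround to test), matches can be collected with `exec` in a loop instead of `match`:

```javascript
// Collect matches with RegExp#exec instead of String#match.
const text = 'id=42; id=7; id=1000';
const re = /id=(\d+)/g; // the /g flag makes exec() advance through the string

const ids = [];
let m;
while ((m = re.exec(text)) !== null) {
  ids.push(m[1]); // first capture group of each match
}
console.log(ids); // [ '42', '7', '1000' ]
```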

Informate commented 3 months ago

I think the Node.js environment should provide: a method to finalize a string, a method on the garbage collector, and an automatic invocation of that GC method when a memory allocation fails for OoM (before the heap crashes and the Node.js process is probably terminated).

According to ChatGPT, a solution could be to use a custom string finalizer function:

let newStr = Buffer.from(str, 'utf-8').toString('utf-8');

The garbage collector method could, for example, check for big holes of unused memory in the original string buffers, starting from the longest one, and then possibly move (collapse) all the string references in the string composition tree hierarchy. (A long task, maybe blocking for some ms like the old-space GC, but only on request or at OoM.)

Informate commented 3 months ago

Okay, a minimal reproducible example is still important to help diagnose the issue.

I posted the example as requested.

aduh95 commented 3 months ago

Note that Node.js 12.x is End-of-Life since 2022-04-30 (more than 2 years ago at the time of writing), meaning it's no longer supported and issues reported on that version won't be fixed. I would recommend updating to a supported version (e.g. Node.js 22.x or 20.x).

RedYetiDev commented 3 months ago

As @aduh95 mentioned, can you reproduce in v18, v20, or v22?


Additionally, could you strip your example down to be minimal? Right now it has a lot of different steps, which could interfere with the problem at hand.

Informate commented 3 months ago

I have updated to v22.4.0!

RedYetiDev commented 3 months ago

Did that resolve the issue?

Informate commented 3 months ago

Did that resolve the issue?

No! It would need a GC update. But as it degrades performance it could be an optional GC feature.

RedYetiDev commented 3 months ago

Additionally, could you strip your example down to be minimal? Right now it has a lot of different steps, which could interfere with the problem at hand.

Again, a minimally reproducible example is crucial when filing a bug report. Without it, there's no way to tell what exactly is causing a certain issue. Currently, your example relies on several functions, so it's hard to tell what's going wrong.

If you're unable to provide such examples, then I'm going to close this issue, as it's not reproducible.

climba03003 commented 3 months ago

I thought a minimal reproducible example in Node.js refers to one that uses only Node.js components. The number of steps or extracted functions does not make it any less of a minimal reproducible example. Personally, I don't think eagerly closing an issue that has an example would help anything.

The root cause of the issue is simply that V8's string optimization is not good enough to trim the RAM usage of a referenced string view. It is not actionable on the Node.js side, but the issue itself still persists.

RedYetiDev commented 3 months ago

I believe a minimal reproducible example refers to an example leaving nothing up to chance, such as fetching arbitrary data (in this case Wikipedia).

Because this issue relies on RegExp, but HTTP is used, it's hard to tell which part causes the issue at hand.

I might be wrong, though.


Additionally, if it's not actionable on the Node.js side, then there's nothing we can do about it.

Informate commented 3 months ago

I believe a minimal reproducible example refers to an example leaving nothing up to chance, such as fetching arbitrary data (in this case Wikipedia).

Because this issue relies on RegExp, but HTTP is used, it's hard to tell which part causes the issue at hand.

I might be wrong, though.

Additionally, if it's not actionable on the Node.js side, then there's nothing we can do about it.

@RedYetiDev I get your point, but this is a minimal example not relying on chance.

I give you a better explanation of what is happening:

We need:

1 - something loading new data from buffers into strings: an HTTPS GET loads new data into the buffers.
2 - something cutting the strings and keeping just a small part of each string.
3 - repeat until memory saturation.

I will give you a minimal example as soon as I can, if this one is too complex. But it is working, and not by chance.

Informate commented 3 months ago

@RedYetiDev I have provided the minimal example working for me, as requested. I've tested it on Node v12 and Node v22; both were killed by the OoM error.

Let me know if it is OK now!

RedYetiDev commented 2 months ago

AFAICT there's nothing that can be done from the Node.js side, as this is controlled by V8, but I would like @nodejs/v8 to verify that information.

Informate commented 2 months ago

AFAICT there's nothing that can be done from the Node.js side, as this is controlled by V8, but I would like @nodejs/v8 to verify that information.

So where should we post the issue?

RedYetiDev commented 2 months ago

I'm not sure yet, let's wait for @nodejs/v8 to weigh in.

aduh95 commented 2 months ago

The code in the OP grows an array in an infinite loop, I would expect the process to eventually run out of memory – so if that what’s happening, I’d say it’s working as expected.

Informate commented 2 months ago

The code in the OP grows an array in an infinite loop, I would expect the process to eventually run out of memory – so if that what’s happening, I’d say it’s working as expected.

No. It pushes an object of 16 bytes and loses 1 MB of memory! That's the memory leak! This is the reason you see the memory growing so fast in the example: at each cycle you will see an increase of at least 1 MB of heap.

Even more: could you actually run the cycle 500,000 times on your machine?

RedYetiDev commented 2 months ago

You are losing 1 MB of memory because you are creating a 1 MB buffer on each iteration. This seems to be working as intended. To further show that I don't think there is any memory leak, I have made a demo script:

function formatMemoryUsage(bytes) {
    return `${(bytes / 1024 / 1024).toFixed(2)} MB`;
}

function logMemoryUsage() {
    const usage = process.memoryUsage();
    console.log(`Memory Usage: Heap Total: ${formatMemoryUsage(usage.heapTotal)}, Heap Used: ${formatMemoryUsage(usage.heapUsed)}`);
}

// Simulate string slicing operations
function simulateStringSlicing() {
    let str = 'a'.repeat(10 * 1024 * 1024); // 10 MB string
    let slicedArray = [];

    // Simulate slicing
    for (let i = 0; i < 1000; i++) {
        let sliced = str.slice(i, i + Math.floor(Math.random() * 100));
        slicedArray.push(sliced);
    }
}

setInterval(() => {
    simulateStringSlicing();
    logMemoryUsage();
}, 1000);

After letting it run for some time, I saw no abnormal memory changes:

Memory Usage: Heap Total: 14.20 MB, Heap Used: 12.77 MB
Memory Usage: Heap Total: 15.70 MB, Heap Used: 13.16 MB
Memory Usage: Heap Total: 16.20 MB, Heap Used: 13.18 MB
Memory Usage: Heap Total: 15.20 MB, Heap Used: 13.16 MB
Memory Usage: Heap Total: 15.20 MB, Heap Used: 13.16 MB
Memory Usage: Heap Total: 15.20 MB, Heap Used: 13.17 MB
Memory Usage: Heap Total: 15.20 MB, Heap Used: 13.17 MB
Memory Usage: Heap Total: 15.20 MB, Heap Used: 13.18 MB
Memory Usage: Heap Total: 15.20 MB, Heap Used: 13.19 MB
Memory Usage: Heap Total: 15.20 MB, Heap Used: 13.25 MB
Memory Usage: Heap Total: 15.20 MB, Heap Used: 13.25 MB
Memory Usage: Heap Total: 14.45 MB, Heap Used: 13.07 MB
Memory Usage: Heap Total: 14.45 MB, Heap Used: 13.07 MB

If you want to try it in an isolated system, I've hosted the code on RunKit.

climba03003 commented 2 months ago

I don't think there is any memory leakage, I have made a demo script

Your demo shows an unclear understanding of the problem. Try the one below; you can see that the occupied memory is never released. That memory belongs to the parent string, not the sliced string, because the parent is kept referenced and never disposed.

In the expected situation, it should release the memory of the unwanted parts of the parent string and only occupy the memory of the sliced string.

function formatMemoryUsage(bytes) {
  return `${(bytes / 1024 / 1024).toFixed(2)} MB`;
}

function printMemoryUsage() {
  const usage = process.memoryUsage()
  console.log(
    `Memory Usage: RSS: %s, Heap Total: %s, Heap Used: %s, External: %s, ArrayBuffers: %s`,
    formatMemoryUsage(usage.rss),
    formatMemoryUsage(usage.heapTotal),
    formatMemoryUsage(usage.heapUsed),
    formatMemoryUsage(usage.external),
    formatMemoryUsage(usage.arrayBuffers),
  );
}

// we use an array to store sliced reference
const references = []

// Simulate string slicing operations
function simulateStringSlicing() {
  let str = 'a'.repeat(10 * 1024 * 1024); // 10 MB string

  // we take random small part of the string
  const start = Math.floor(Math.random() * 100)
  const end = Math.floor(Math.random() * 1000)
  references.push(str.slice(start, end))
}

// run for 10 times
printMemoryUsage()
for(let i = 0; i < 10; i++) {
  simulateStringSlicing()
}
printMemoryUsage()

// check for memory usage
setInterval(printMemoryUsage, 1000);

RedYetiDev commented 2 months ago

Sorry for the misunderstanding; I indeed didn't understand that part of the question.

If I understand correctly now, when slicing a string, the full string isn't removed from memory so long as the sliced reference still exists.

I've hidden my comment to help not de-rail the conversation.

climba03003 commented 2 months ago

If I understand correctly now, when slicing a string, the full string isn't removed from memory so long as the sliced reference still exists.

Yes, whenever you slice a string, it keeps a reference to the original string and never removes the unwanted part from memory, even if the code never uses the original again. You may try to unreference the original string with str = null after the array push: it still stays in memory.
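A minimal sketch of that point (the retention itself is V8-internal, so the pinning cannot be observed directly from JS code; the copy step is the same Buffer round-trip used earlier in the thread):

```javascript
const refs = [];
let parent = 'a'.repeat(10 * 1024 * 1024); // ~10 MB backing store
refs.push(parent.slice(0, 16));            // sliced string: points into `parent`
parent = null;                             // the name is gone, but V8 may keep
                                           // the whole 10 MB alive via the slice

// Flattening the slice through a Buffer detaches it from the backing store:
refs[0] = Buffer.from(refs[0], 'utf-8').toString('utf-8');
console.log(refs[0].length); // 16
```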

RedYetiDev commented 2 months ago

Thanks, and I apologize for my misunderstanding. The String implementation comes directly from V8. As @climba03003 mentioned earlier, this issue is tracked at https://issues.chromium.org/issues/41480525.

This issue was previously reported at https://github.com/nodejs/node/issues/25846, and, at the end of the day, there's not much that can be done on the Node.js side. I suggest you comment on that issue describing how this affects you, and try to get Google's V8 team to work on it.

This issue has existed for over a decade, and I'm not sure a patch is coming anytime soon.