v8 / v8.dev

The source code of v8.dev, the official website of the V8 project.
https://v8.dev/
Apache License 2.0

Strings subsystem generates (hard to detect) memory leaks. Garbage Collector update request #774

Open Informate opened 4 months ago

Informate commented 4 months ago

Version

v22.4.0 and previous

Platform

Linux 6.5.0-41-generic

Subsystem

Strings, Garbage Collector

What steps will reproduce the bug?

At OOM, in some examples only 1%-2% of the heap is actually in use; a waste of 5%-10% could be typical. The following edge-case example should reach a memory waste of 99.9984%: each iteration keeps only 16 characters of a 1 MB string.

$ node --max-old-space-size=6 urls-7.js
// urls-7.js

function getRandomBuffer(){
 const bufferSize = 1024 * 1024; // 1 MB
 const randomBuffer = Buffer.alloc(bufferSize);
 for (let i = 0; i < bufferSize; i++) {
  randomBuffer[i] = Math.floor(Math.random() * 256);
 }
 return randomBuffer;
}

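// Keep only a 16-character slice of each ~1 MB string.
const slices = [];
while (true) {
  const string = getRandomBuffer().toString('utf-8');
  const slice = string.slice(50000, 50016); // was an undeclared (implicit global) variable
  slices.push(slice);
  __heaplog();
}

// Helper function to show memory leaks (hoisted, so it can be called above)
function __heaplog() {
  const m = process.memoryUsage();
  console.log('\nHeap: ' + ((m.heapUsed / 2 ** 20) | 0) + '/' + ((m.heapTotal / 2 ** 20) | 0) + ' MB - RSS: ' + ((m.rss / 2 ** 20) | 0) + ' MB');
}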

How often does it reproduce? Is there a required condition?

At OOM:

<--- Last few GCs --->

[5918:0x70ef000]     1805 ms: Mark-Compact 6.0 (8.1) -> 5.1 (9.9) MB, pooled: 0 MB, 3.53 / 0.00 ms  (average mu = 0.980, current mu = 0.994) task; scavenge might not succeed
[5918:0x70ef000]     1965 ms: Mark-Compact 5.9 (10.2) -> 5.4 (10.2) MB, pooled: 0 MB, 10.56 / 0.00 ms  (average mu = 0.968, current mu = 0.934) background allocation failure; GC in old space requested

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

 1: 0xe21092 node::OOMErrorHandler(char const*, v8::OOMDetails const&) [nodejs]
 2: 0x12224f0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [nodejs]
 3: 0x12227c7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [nodejs]
 4: 0x1452305  [nodejs]
 5: 0x146bb79 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [nodejs]
 6: 0x13bca68 v8::internal::StackGuard::HandleInterrupts(v8::internal::StackGuard::InterruptLevel) [nodejs]
 7: 0x187ab97 v8::internal::Runtime_StackGuardWithGap(int, unsigned long*, v8::internal::Isolate*) [nodejs]
 8: 0x1f31576  [nodejs]

What is the expected behavior? Why is that the expected behavior?

Recover from the error, or handle it before it occurs. Java and JavaScript are memory-managed languages in which the programmer should not need to keep track of memory. In this situation, however, the programmer has to keep track of memory and handle the problem manually.

What do you see instead?

Keeping the strings that result from RegExp operations generates a memory leak. The RegExp extracts short parts from an HTML document retrieved over HTTPS. The bug was already present some years ago (and was never solved, I suppose). Cleaning the string fixes the memory leak (although this variant leaves JSON-related leaks):

clean_string = JSON.parse(JSON.stringify( memory_leaking_string ));

or, better (without other leaks):

clean_string = Buffer.from(memory_leaking_string, 'utf-8').toString('utf-8');

The problem is not solved by other simple operations such as .slice(0) or .toString().
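As a minimal sketch of how the Buffer round-trip above could be applied to the reproduction case: the helper name `detachString` and the bounded loop are illustrative assumptions, not part of the original report, and the explanation that short slices keep their large parent string reachable is an assumption about V8 internals that this snippet does not verify.

```javascript
// Hypothetical helper (the name is illustrative): returns a copy of `s` produced
// by the Buffer round-trip from the report, so the result is a freshly built string.
function detachString(s) {
  return Buffer.from(s, 'utf-8').toString('utf-8');
}

const kept = [];
for (let i = 0; i < 20; i++) {                       // bounded, unlike the while (true) repro
  const big = 'abcdefghij'.repeat(1024 * 1024 / 10); // stand-in for a ~1 MB document
  const part = big.slice(50000, 50016);              // 16-character extract
  kept.push(detachString(part));                     // keep the cleaned copy, not the slice
}

console.log(kept.length, kept[0]);
```

One caveat with this design: the UTF-8 round-trip replaces lone surrogates with U+FFFD, so the copy is only identical to the input for well-formed strings.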

Additional information

The Strings subsystem wastes memory, and the garbage collector is not able to recover from the heap crash.

To solve the problem and recover, the garbage collector could require memory maps.

Random sampling could be performed as part of the standard GC scheduling, and the sampling frequency could be increased when a sample matches. Random sampling with just 1 sample at a time would introduce very limited overhead on the GC. The sampling test would require a full GC cycle on the old space. A fractal partitioning algorithm (1st: middle of the space, 2nd: middle of the first half, 3rd: middle of the second half, 4th: middle of the first quarter, ...) could reduce the random sampling overhead. When the sampling test fails, successive iterations up to the completion of the GC cycle could be used to propose a new candidate (continuing the partial test on the next candidate). With a 10% memory waste, 1-sample random sampling should match in approximately 11 full GC cycles.
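The fractal partitioning order described above can be sketched as a generator. This is only an illustration of the visiting order (1/2, then 1/4 and 3/4, then the odd eighths, and so on); the function name `fractalOffsets` and its parameters are hypothetical, and nothing here reflects how V8 actually walks the heap.

```javascript
// Hypothetical helper: yields up to `maxSamples` offsets in [0, size) in
// "fractal" order: size/2, then size/4 and 3*size/4, then the odd eighths, ...
function* fractalOffsets(size, maxSamples) {
  let emitted = 0;
  for (let parts = 2; emitted < maxSamples; parts *= 2) {
    // Odd multiples of size/parts are the midpoints not visited at coarser levels.
    for (let k = 1; k < parts && emitted < maxSamples; k += 2) {
      yield Math.floor((size * k) / parts);
      emitted++;
    }
  }
}

const offsets = [...fractalOffsets(1024, 7)];
console.log(offsets.join(', ')); // 512, 256, 768, 128, 384, 640, 896
```

Because each level only visits the new midpoints, no offset is repeated as long as `size` is at least as large as the number of partitions reached.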

Running the new GC task in a separate low-priority GC thread on the old heap space would require a limited amount of resources. For example, using 32 samples from random sampling or fractal partitioning, with free-space bounds checks in the task, would require 32 × sizeof(memory_address) × 3 = 768 bytes (with 8-byte addresses), less than 1 KB, plus a stack as deep as the string hierarchy tree multiplied by a constant that could be as small as 2, 3, or just 1 (by inlining the next strings to check in an array).