nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

V8 assertion failure from `handle_checker.CheckGlobalAndEternalHandles()` while building a snapshot #48672

Open RaisinTen opened 1 year ago

RaisinTen commented 1 year ago

Version

v20.4.0

Platform

Darwin Darshans-MacBook-Pro.local 22.4.0 Darwin Kernel Version 22.4.0: Mon Mar 6 21:00:17 PST 2023; root:xnu-8796.101.5~3/RELEASE_X86_64 x86_64

Subsystem

snapshot

What steps will reproduce the bug?

Attempting to build a snapshot from this script causes a crash.

snapshot.js:

const net = require('node:net');
const v8 = require('node:v8');

let x;

const server = net.createServer((socket) => {
  socket.on('data', (data) => {
    x = data.toString();
    server.close();
  });

  socket.end('hello');
});

server.listen(() => {
  const client = net.createConnection({ port: server.address().port }, () => {
    client.end('world');
  });
});

if (v8.startupSnapshot.isBuildingSnapshot()) {
  v8.startupSnapshot.setDeserializeMainFunction(() => {
    console.log('I am from the snapshot')
    console.log(`x: "${x}"`)
  })
} else {
  setTimeout(() => {
    console.log(`x: "${x}"`);
  }, 100);
}

How often does it reproduce? Is there a required condition?

Always.

What is the expected behavior? Why is that the expected behavior?

No crashes.

What do you see instead?

A crash:

$ node --snapshot-blob snapshot.blob --build-snapshot snapshot.js
global handle not serialized: 0x13abd99b3f49: [[api object] 0] in OldSpace
 - map: 0x13ab9cf0b3f9 <Map[56](HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x13ab5b4363e1 <Object map = 0x13ab9cf0b581>
 - elements: 0x13ab29280259 <FixedArray[0]> [HOLEY_ELEMENTS]
 - embedder fields: 4
 - properties: 0x13ab3a89ffc9 <PropertyArray[3]>
 - All own properties (excluding elements): {
    0x13ab6bef8e81: [String] in OldSpace: #reading: 0x13ab29280659 <true> (data field 0), location: properties[0]
    0x13ab5b402919 <Symbol: owner_symbol>: 0x13ab3a8a00f1 <Socket map = 0x13ab9cf1c989> (data field 1), location: properties[1]
    0x13ab5b403879: [String] in OldSpace: #onconnection: 0x13ab29280269 <null> (data field 2), location: properties[2]
 }
 - embedder fields = {
    1, aligned pointer: 0x111ed8048
    32701, aligned pointer: 0x7fbdd2805cf0
    32701, aligned pointer: 0x7fbdd2805d48
    0x13ab3a88f8c1 <JSFunction onStreamRead (sfi = 0x13ab69148f69)>
 }

#
# Fatal error in , line 0
# Check failed: handle_checker.CheckGlobalAndEternalHandles().
#
#
#
#FailureMessage Object: 0x7ff7b24e5bc0
 1: 0x10db725d2 node::NodePlatform::GetStackTracePrinter()::$_3::__invoke() [/usr/local/bin/node]
 2: 0x10ee300c3 V8_Fatal(char const*, ...) [/usr/local/bin/node]
 3: 0x10dcb1202 v8::SnapshotCreator::CreateBlob(v8::SnapshotCreator::FunctionCodeHandling) [/usr/local/bin/node]
 4: 0x10db9e4b9 node::SnapshotBuilder::CreateSnapshot(node::SnapshotData*, node::CommonEnvironmentSetup*, unsigned char) [/usr/local/bin/node]
 5: 0x10db9dd67 node::SnapshotBuilder::Generate(node::SnapshotData*, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::optional<std::__1::basic_string_view<char, std::__1::char_traits<char>>>) [/usr/local/bin/node]
 6: 0x10dac8eb6 node::GenerateAndWriteSnapshotData(node::SnapshotData const**, node::InitializationResultImpl const*) [/usr/local/bin/node]
 7: 0x10dac9569 node::Start(int, char**) [/usr/local/bin/node]
 8: 0x7ff80cf0a41f start [/usr/lib/dyld]
[1]    3554 trace trap  node --snapshot-blob snapshot.blob --build-snapshot snapshot.js

The failure originates from this check in V8: https://github.com/nodejs/node/blob/b5e16adb1d155759e7db405eead5a43cd425785d/deps/v8/src/api/api.cc#L764-L766

Additional information

Running the script normally just records the data returned by the client:

$ node snapshot.js
x: "world"

(I was experimenting with storing the state of various kinds of asynchronous activity during the snapshot-building phase, so that the results could be used later in the setDeserializeMainFunction snapshot entry point.)


cc @joyeecheung

joyeecheung commented 1 year ago

The server cannot be active when the snapshot is taken (e.g. it must stop listening before the snapshot is taken, and it can listen again after the snapshot is deserialized). For obvious reasons we cannot serialize system resources on the building machine and deserialize them on a potentially different machine.
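As a rough sketch of the "listen again after the snapshot is deserialized" idea: the script below creates no listening server while the snapshot is being built and only starts one inside the deserialize main function. The helper names createGreeter and startGreeter are hypothetical, and this layout has not been verified against the snapshot builder; it only illustrates where the listen() call would move.

```javascript
const net = require('node:net');
const v8 = require('node:v8');

// Hypothetical helper: builds the server object without listening.
function createGreeter() {
  return net.createServer((socket) => {
    socket.end('hello');
  });
}

// Hypothetical helper: starts listening on an ephemeral port and
// hands the server to the callback once it is ready.
function startGreeter(onListening) {
  const server = createGreeter();
  server.listen(() => onListening(server));
  return server;
}

if (v8.startupSnapshot.isBuildingSnapshot()) {
  // Nothing is listening while the snapshot is built. Listening starts
  // only when the deserialized application begins to run.
  v8.startupSnapshot.setDeserializeMainFunction(() => {
    startGreeter((server) => {
      console.log(`listening on port ${server.address().port}`);
    });
  });
}
```

When the file is run without --build-snapshot, the top level only defines the helpers and no server is started.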

joyeecheung commented 1 year ago

I think the example in the OP is meant to close the server before the snapshot is taken, but it does not actually release the handles properly, and some of them leak: if you log process._getActiveHandles() at process exit, two streams and a socket are still around when the process shuts down.
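One way to run that check, as a minimal sketch: append something like the following to the script. describeActiveHandles is a hypothetical helper name, and process._getActiveHandles() is an undocumented internal API, so its output format may differ between Node.js versions.

```javascript
// Hypothetical helper: lists the constructor name of every handle that
// libuv still considers active (for example Server or Socket).
function describeActiveHandles() {
  // process._getActiveHandles() is internal and undocumented.
  return process._getActiveHandles().map(
    (handle) => handle?.constructor?.name ?? typeof handle
  );
}

// Log whatever is still alive at the moment the process shuts down.
process.on('exit', () => {
  console.log('active handles at exit:', describeActiveHandles());
});
```

Any name that still appears in the list at exit points to a handle that was not closed.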

RaisinTen commented 1 year ago

> The server cannot be active when the snapshot is taken (e.g. it must stop listening before the snapshot is taken, and it can listen again after the snapshot is deserialized).

Is it possible to delay the snapshot generation step until x has been assigned a value, or until process._getActiveHandles() no longer contains any of the relevant handles? The server should be closed by that point, and it does not need to start listening again after the snapshot is deserialized.

joyeecheung commented 1 year ago

I think there is a leak going on, so this can't be done until the actual leak is fixed. Running the following script with --max-old-space-size=10 crashes with an out-of-memory (OOM) error:

'use strict';

const net = require('node:net');
const v8 = require('node:v8');

let count = 0;
function run() {
  const server = net.createServer((socket) => {
    socket.on('data', (data) => {
      server.close();
    });
    socket.end('hello');
  }).listen(() => {
    const client = net.createConnection({ port: server.address().port }, () => {
      client.end('world');
    });
  }).on('close', () => {
    if (count++ < 100000) {
      setTimeout(run, 1);
    } else {
      console.log(v8.writeHeapSnapshot());
    }
  });
}
run();

If the counter is lowered to, e.g., 1000 so that the process still has enough (JS heap) memory to finish, the generated heap snapshot shows that the socket is never actually reclaimed.