nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.31k stars 29.46k forks source link

Tracking issue: support of builtins and V8 issues in run-time user-land snapshots #44014

Open joyeecheung opened 2 years ago

joyeecheung commented 2 years ago

This issue is opened to track known bugs and limitations of the run-time user-land snapshot.

Currently known limitations/bugs in Node.js builtins:

Currently known limitations/bugs that are being fixed in the V8 upstream:

goloveychuk commented 2 years ago

Hey, big thanks for work. I've checkout https://github.com/nodejs/node/pull/38905 and applied https://bugs.chromium.org/p/v8/issues/detail?id=12921 patch on top of it. Generally it works good, but I found another segfault. It seems like it's related to GC. I've reproduced it two times. 1) 50 mb bundle, snapshot created, restored, segfault on running function which fs.readFileSync(this50mbFile, 'utf8') 2) 150 mb bundle, snapshot created, segfault on restoration (running node --snapshot-blob).

LLDB logs

``` ➜ snapshot git:(master) ✗ lldb -- /home/badim/github/node/out/Debug/node --snapshot-blob snapshot.blob script.js (lldb) target create "/home/badim/github/node/out/Debug/node" runCurrent executable set to '/home/badim/github/node/out/Debug/node' (x86_64). (lldb) settings set -- target.run-args "--snapshot-blob" "snapshot.blob" "script.js" (lldb) run Process 59768 launched: '/home/badim/github/node/out/Debug/node' (x86_64) Process 59768 stopped * thread #6, name = 'node', stop reason = signal SIGSEGV: invalid address (fault address: 0x559b20665e50) frame #0: 0x0000555556e59c40 node`v8::internal::JSArrayBuffer::MarkExtension() [inlined] std::__atomic_base::store(__m=memory_order_relaxed, __i=true, this=0x0000559b20665e50) at atomic_base.h:404:18 401 __glibcxx_assert(__b != memory_order_acq_rel); 402 __glibcxx_assert(__b != memory_order_consume); 403 -> 404 __atomic_store_n(&_M_i, __i, int(__m)); 405 } 406 407 _GLIBCXX_ALWAYS_INLINE void (lldb) bt * thread #6, name = 'node', stop reason = signal SIGSEGV: invalid address (fault address: 0x559b20665e50) * frame #0: 0x0000555556e59c40 node`v8::internal::JSArrayBuffer::MarkExtension() [inlined] std::__atomic_base::store(__m=memory_order_relaxed, __i=true, this=0x0000559b20665e50) at atomic_base.h:404:18 frame #1: 0x0000555556e59c40 node`v8::internal::JSArrayBuffer::MarkExtension() [inlined] std::atomic::store(__m=memory_order_relaxed, __i=true, this=0x0000559b20665e50) at atomic:104 frame #2: 0x0000555556e59c40 node`v8::internal::JSArrayBuffer::MarkExtension() [inlined] v8::internal::ArrayBufferExtension::Mark(this=0x0000559b20665e50) at js-array-buffer.h:182 frame #3: 0x0000555556e59c40 node`v8::internal::JSArrayBuffer::MarkExtension(this=0x00007ffff5a9d930) at js-array-buffer.cc:202 frame #4: 0x0000555556b22b17 node`v8::internal::HeapVisitor::Visit(v8::internal::Map, v8::internal::HeapObject) [inlined] v8::internal::MarkingVisitorBase::VisitJSArrayBuffer(object=JSArrayBuffer @ 0x00007ffff5a9d930, map=, this=0x00007ffff5a9dc40) at marking-visitor-inl.h:321:23 frame #5: 0x0000555556b22b08 node`v8::internal::HeapVisitor::Visit(this=0x00007ffff5a9dc40, map=Map @ rbp, object=) at objects-visiting-inl.h:64 frame #6: 0x0000555556b2519a node`v8::internal::ConcurrentMarking::Run(this=0x000055555b85e560, delegate=, code_flush_mode=, mark_compact_epoch=, should_keep_ages_unchanged=) at concurrent-marking.cc:536:46 frame #7: 0x0000555556b25cf1 node`v8::internal::ConcurrentMarking::JobTask::Run(this=0x000055555edad1c0, delegate=0x00007ffff5a9ed70) at concurrent-marking.cc:425:31 frame #8: 0x0000555557b974e8 node`v8::platform::DefaultJobWorker::Run(this=0x000055555edad250) at default-job.h:147:21 frame #9: 0x0000555556616f22 node`node::(anonymous namespace)::PlatformWorkerThread(data=0x000055555b4394f0) at node_platform.cc:43:14 frame #10: 0x00007ffff7c76ea7 libpthread.so.0`start_thread + 215 frame #11: 0x00007ffff7ba6def libc.so.6`__clone + 63 (lldb) ```

It looks like it starting GC snapshotted objects and there is a null pointer or smth. --max-old-space-size=12000 --max-semi-space-size=10000 didn't helped. Maybe I need to compile node with GC disabled to check if it's 100% GC. Ping me If I can help with debugging.

joyeecheung commented 2 years ago

@goloveychuk Thanks for the report, this is just a guess but can you try it with https://chromium-review.googlesource.com/c/v8/v8/+/3793525 ? If it shows up again, can you try building Node.js with the --v8-non-optimized-debug configure option and see if you can have a more detailed stack trace?

goloveychuk commented 2 years ago

applied patch, built, run, segfault. Invalid v8heapconst could not cause this bug, right? Because I cannot generate it... :)

goloveychuk commented 2 years ago

@joyeecheung took 30 mins to generate snapshot :)

lldb for --v8-non-optimized-debug

``` ➜ snapshot git:(master) ✗ lldb -- /home/badim/github/node/out/Debug/node --snapshot-blob snapshot.blob script.js (lldb) target create "/home/badim/github/node/out/Debug/node" Current executable set to '/home/badim/github/node/out/Debug/node' (x86_64). (lldb) settings set -- target.run-args "--snapshot-blob" "snapshot.blob" "script.js" (lldb) run Process 61239 launched: '/home/badim/github/node/out/Debug/node' (x86_64) Process 61239 stopped * thread #1, name = 'node', stop reason = signal SIGSEGV: invalid address (fault address: 0x55c339d43a70) frame #0: 0x000055555791a3d5 node`std::atomic::store(bool, std::memory_order) [inlined] std::__atomic_base::store(__m=memory_order_relaxed, __i=true, this=0x000055c339d43a70) at atomic_base.h:404:18 401 __glibcxx_assert(__b != memory_order_acq_rel); 402 __glibcxx_assert(__b != memory_order_consume); 403 -> 404 __atomic_store_n(&_M_i, __i, int(__m)); 405 } 406 407 _GLIBCXX_ALWAYS_INLINE void (lldb) bt * thread #1, name = 'node', stop reason = signal SIGSEGV: invalid address (fault address: 0x55c339d43a70) * frame #0: 0x000055555791a3d5 node`std::atomic::store(bool, std::memory_order) [inlined] std::__atomic_base::store(__m=memory_order_relaxed, __i=true, this=0x000055c339d43a70) at atomic_base.h:404:18 frame #1: 0x000055555791a3bb node`std::atomic::store(this=0x000055c339d43a70, __i=true, __m=memory_order_relaxed) at atomic:104 frame #2: 0x000055555827c678 node`v8::internal::ArrayBufferExtension::Mark(this=0x000055c339d43a70) at js-array-buffer.h:182:30 frame #3: 0x000055555857140d node`v8::internal::JSArrayBuffer::MarkExtension(this=0x00007fffffffbb88) at js-array-buffer.cc:202:20 frame #4: 0x0000555558246538 node`v8::internal::MarkingVisitorBase, v8::internal::MajorMarkingState>::VisitJSArrayBuffer(this=0x0000555560e258d0, map=Map @ 0x00007fffffffbb90, object=JSArrayBuffer @ 0x00007fffffffbb88) at marking-visitor-inl.h:321:23 frame #5: 0x000055555823a284 node`v8::internal::HeapVisitor >::Visit(this=0x0000555560e258d0, map=Map @ 0x00007fffffffbbc0, object=HeapObject @ 0x00007fffffffbbb8) at objects-visiting-inl.h:64:5 frame #6: 0x0000555558223cd1 node`std::pair v8::internal::MarkCompactCollector::ProcessMarkingWorklist<(this=0x000055555dadbd40, bytes_to_process=65536)0>(unsigned long) at mark-compact.cc:2256:50 frame #7: 0x0000555558208ab9 node`v8::internal::IncrementalMarking::Step(this=0x000055555d623ab0, max_step_size_in_ms=5, action=GC_VIA_STACK_GUARD, step_origin=kV8) at incremental-marking.cc:928:60 frame #8: 0x0000555558208849 node`v8::internal::IncrementalMarking::AdvanceOnAllocation(this=0x000055555d623ab0) at incremental-marking.cc:878:7 frame #9: 0x000055555820479b node`v8::internal::IncrementalMarking::Observer::Step(this=0x000055555d623b70, bytes_allocated=262128, addr=47321084409928, size=40) at incremental-marking.cc:49:44 frame #10: 0x00005555581063e9 node`v8::internal::AllocationCounter::InvokeAllocationObservers(this=0x000055555d61c4a8, soon_object=47321084409928, object_size=40, aligned_object_size=40) at allocation-observer.cc:104:28 frame #11: 0x00005555582e7014 node`v8::internal::SpaceWithLinearArea::InvokeAllocationObservers(this=0x000055555d61c480, soon_object=47321084409928, size_in_bytes=40, aligned_size_in_bytes=40, allocation_size=40) at spaces.cc:431:50 frame #12: 0x000055555816c264 node`v8::internal::SpaceWithLinearArea::AllocateRawUnaligned(this=0x000055555d61c480, size_in_bytes=40, origin=kRuntime) at spaces-inl.h:272:28 frame #13: 0x000055555816c29d node`v8::internal::SpaceWithLinearArea::AllocateRawSlow(this=0x000055555d61c480, size_in_bytes=40, alignment=kTaggedAligned, origin=kRuntime) at spaces-inl.h:307:55 frame #14: 0x000055555816c0a4 node`v8::internal::SpaceWithLinearArea::AllocateRaw(this=0x000055555d61c480, size_in_bytes=40, alignment=kTaggedAligned, origin=kRuntime) at spaces-inl.h:253:79 frame #15: 0x000055555818a83e node`v8::internal::AllocationResult v8::internal::HeapAllocator::AllocateRaw<(v8::internal::AllocationType)1>(this=0x000055555da70da0, size_in_bytes=40, origin=kRuntime, alignment=kTaggedAligned) at heap-allocator-inl.h:103:39 frame #16: 0x000055555818ba25 node`v8::internal::HeapObject v8::internal::HeapAllocator::AllocateRawWith<(v8::internal::HeapAllocator::AllocationRetryMode)1>(this=0x000055555da70da0, size=40, allocation=kOld, origin=kRuntime, alignment=kTaggedAligned) at heap-allocator-inl.h:224:47 frame #17: 0x000055555894d4b8 node`v8::internal::Heap::AllocateRawOrFail(this=0x000055555da70d48, size=40, allocation=kOld, origin=kRuntime, alignment=kTaggedAligned) at heap-inl.h:211:62 frame #18: 0x0000555558950fde node`v8::internal::Deserializer::Allocate(this=0x00007fffffffd5b0, allocation=kOld, size=40, alignment=kTaggedAligned) at deserializer.cc:1317:43 frame #19: 0x000055555894fb6b node`v8::internal::Deserializer::ReadObject(this=0x00007fffffffd5b0, space=kOld) at deserializer.cc:670:15 frame #20: 0x00005555589570dc node`int v8::internal::Deserializer::ReadSingleBytecodeData(this=0x00007fffffffd5b0, data='\x01', slot_accessor=SlotAccessorForHeapObject @ 0x00007fffffffc3f0) at deserializer.cc:938:50 frame #21: 0x000055555894f827 node`v8::internal::Deserializer::ReadData(this=0x00007fffffffd5b0, object=Handle @ 0x00007fffffffc700, start_slot_index=1, end_slot_index=7) at deserializer.cc:908:38 frame #22: 0x000055555894ff4a node`v8::internal::Deserializer::ReadObject(this=0x00007fffffffd5b0, space=kOld) at deserializer.cc:709:11 frame #23: 0x00005555589570dc node`int v8::internal::Deserializer::ReadSingleBytecodeData(this=0x00007fffffffd5b0, data='\x01', slot_accessor=SlotAccessorForHeapObject @ 0x00007fffffffc820) at deserializer.cc:938:50 frame #24: 0x000055555894f827 node`v8::internal::Deserializer::ReadData(this=0x00007fffffffd5b0, object=Handle @ 0x00007fffffffcb30, start_slot_index=1, end_slot_index=3) at deserializer.cc:908:38 frame #25: 0x000055555894ff4a node`v8::internal::Deserializer::ReadObject(this=0x00007fffffffd5b0, space=kOld) at deserializer.cc:709:11 frame #26: 0x00005555589570dc node`int v8::internal::Deserializer::ReadSingleBytecodeData(this=0x00007fffffffd5b0, data='\x01', slot_accessor=SlotAccessorForHeapObject @ 0x00007fffffffcc50) at deserializer.cc:938:50 frame #27: 0x000055555894f827 node`v8::internal::Deserializer::ReadData(this=0x00007fffffffd5b0, object=Handle @ 0x00007fffffffcf60, start_slot_index=1, end_slot_index=6151) at deserializer.cc:908:38 frame #28: 0x000055555894ff4a node`v8::internal::Deserializer::ReadObject(this=0x00007fffffffd5b0, space=kOld) at deserializer.cc:709:11 frame #29: 0x00005555589570dc node`int v8::internal::Deserializer::ReadSingleBytecodeData(this=0x00007fffffffd5b0, data='\x01', slot_accessor=SlotAccessorForHeapObject @ 0x00007fffffffd080) at deserializer.cc:938:50 frame #30: 0x000055555894f827 node`v8::internal::Deserializer::ReadData(this=0x00007fffffffd5b0, object=Handle @ 0x00007fffffffd390, start_slot_index=1, end_slot_index=3) at deserializer.cc:908:38 frame #31: 0x000055555894ff4a node`v8::internal::Deserializer::ReadObject(this=0x00007fffffffd5b0, space=kOld) at deserializer.cc:709:11 frame #32: 0x000055555894eebc node`v8::internal::Deserializer::DeserializeDeferredObjects(this=0x00007fffffffd5b0) at deserializer.cc:306:15 frame #33: 0x000055555894a0e5 node`v8::internal::ContextDeserializer::Deserialize(this=0x00007fffffffd5b0, isolate=0x000055555da64720, global_proxy=Handle @ 0x00007fffffffd4f8, embedder_fields_deserializer=v8::DeserializeEmbedderFieldsCallback @ 0x00007fffffffd4e0) at context-deserializer.cc:47:31 frame #34: 0x0000555558949f3c node`v8::internal::ContextDeserializer::DeserializeContext(isolate=0x000055555da64720, data=0x00007fffffffd7d0, can_rehash=true, global_proxy=Handle @ 0x00007fffffffd580, embedder_fields_deserializer=v8::DeserializeEmbedderFieldsCallback @ 0x00007fffffffd570) at context-deserializer.cc:24:72 frame #35: 0x0000555558987c89 node`v8::internal::Snapshot::NewContextFromSnapshot(isolate=0x000055555da64720, global_proxy=Handle @ 0x00007fffffffd7b0, context_index=2, embedder_fields_deserializer=v8::DeserializeEmbedderFieldsCallback @ 0x00007fffffffd790) at snapshot.cc:212:35 frame #36: 0x000055555834d5af node`v8::internal::Genesis::Genesis(this=0x00007fffffffdad0, isolate=0x000055555da64720, maybe_global_proxy=MaybeHandle @ 0x00007fffffffd8c8, global_proxy_template=(val_ = 0x0000000000000000), context_snapshot_index=2, embedder_fields_deserializer=v8::DeserializeEmbedderFieldsCallback @ 0x00007fffffffda90, microtask_queue=0x0000000000000000) at bootstrapper.cc:6331:70 frame #37: 0x00005555583239ea node`v8::internal::Bootstrapper::CreateEnvironment(this=0x000055555da62540, maybe_global_proxy=MaybeHandle @ 0x00007fffffffdac0, global_proxy_template=(val_ = 0x0000000000000000), extensions=0x00007fffffffdd90, context_snapshot_index=2, embedder_fields_deserializer=v8::DeserializeEmbedderFieldsCallback @ 0x00007fffffffdb50, microtask_queue=0x0000000000000000) at bootstrapper.cc:332:36 frame #38: 0x0000555557d7a925 node`v8::InvokeBootstrapper::Invoke(this=0x00007fffffffdbff, isolate=0x000055555da64720, maybe_global_proxy=MaybeHandle @ 0x00007fffffffdb78, global_proxy_template=(val_ = 0x0000000000000000), extensions=0x00007fffffffdd90, context_snapshot_index=2, embedder_fields_deserializer=DeserializeInternalFieldsCallback @ 0x00007fffffffdba0, microtask_queue=0x0000000000000000) at api.cc:6240:78 frame #39: 0x0000555557da3cff node`v8::CreateEnvironment(isolate=0x000055555da64720, extensions=0x00007fffffffdd90, maybe_global_template=(val_ = 0x0000000000000000), maybe_global_proxy=(val_ = 0x0000000000000000), context_snapshot_index=2, embedder_fields_deserializer=DeserializeInternalFieldsCallback @ 0x00007fffffffdd40, microtask_queue=0x0000000000000000) at api.cc:6340:27 frame #40: 0x0000555557d7aa85 node`v8::NewContext(external_isolate=0x000055555da64720, extensions=0x00007fffffffdd90, global_template=(val_ = 0x0000000000000000), global_object=(val_ = 0x0000000000000000), context_snapshot_index=2, embedder_fields_deserializer=DeserializeInternalFieldsCallback @ 0x00007fffffffddf0, microtask_queue=0x0000000000000000) at api.cc:6383:76 frame #41: 0x0000555557d7abd9 node`v8::Context::FromSnapshot(external_isolate=0x000055555da64720, context_snapshot_index=1, embedder_fields_deserializer=DeserializeInternalFieldsCallback @ 0x00007fffffffde10, extensions=0x0000000000000000, global_object=(val_ = 0x0000000000000000), microtask_queue=0x0000000000000000) at api.cc:6415:66 frame #42: 0x0000555557adbbf9 node`node::NodeMainInstance::CreateMainEnvironment(this=0x00007fffffffe070, exit_code=0x00007fffffffdfdc) at node_main_instance.cc:188:79 frame #43: 0x0000555557adb824 node`node::NodeMainInstance::Run(this=0x00007fffffffe070) at node_main_instance.cc:128:39 frame #44: 0x0000555557a0f36c node`node::LoadSnapshotDataAndRun(snapshot_data_ptr=0x00007fffffffe138, result=0x00007fffffffe140) at node.cc:1255:40 frame #45: 0x0000555557a0f517 node`node::Start(argc=4, argv=0x00007fffffffe2b8) at node.cc:1291:32 frame #46: 0x0000555559411d7e node`main(argc=4, argv=0x00007fffffffe2b8) at node_main.cc:127:21 frame #47: 0x00007ffff7acfd0a libc.so.6`__libc_start_main + 234 frame #48: 0x00005555578c602a node`_start + 42 (lldb) ```

joyeecheung commented 2 years ago

@goloveychuk Thanks for the follow-up. Is the code causing this error related to https://github.com/nodejs/node/issues/40832#issuecomment-1155320869 by any chance?

goloveychuk commented 2 years ago

No, that error is "bus error", which was 30% error rate. I'm not able to reproduce that in current node master. We can ignore that, I think. This error I'm able to reproduce 100% of time

joyeecheung commented 2 years ago

@goloveychuk This looks like a duplicate of https://bugs.chromium.org/p/v8/issues/detail?id=13149 - can you try building with https://chromium-review.googlesource.com/c/v8/v8/+/3813068 and see if the crash goes away?

ManojKumar-1204 commented 1 year ago

// Remove global event listeners on the process object during snapshot serialization const listeners = process.listeners('uncaughtException'); process.removeAllListeners('uncaughtException');

// Take a snapshot of the application const snapshot = process.binding('v8').createSnapshot();

// Reinstall global event listeners on the process object during deserialization listeners.forEach(listener => { process.on('uncaughtException', listener); });

// Restore the application from the snapshot process.binding('v8').setFlagsFromString('--no-wasm-tier-up'); process.binding('v8').setSnapshot(snapshot);

JoshuaWise commented 1 year ago

I just want to voice my enthusiasm for supporting user-land modules in snapshots, after today I found out postcss-preset-env takes 210 ms to require() 😅