nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Usage of Oilpan / V8 C++ Garbage Collector #40786

Open mcollina opened 3 years ago

mcollina commented 3 years ago

I recently read the article about Oilpan: https://v8.dev/blog/oilpan-library.

It might be worth investigating whether we could use it inside the Node.js C++ internals to simplify some of our memory management and wrapping logic.

mcollina commented 3 years ago

cc @nodejs/v8 @addaleax @jasnell

joyeecheung commented 3 years ago

I think it's worth a try. IIUC, we didn't use EmbedderHeapTracer because it only works with BaseObjects that have called MakeWeak(). Oilpan is more powerful, so it should also work with other BaseObjects (and with other kinds of native objects that don't use BaseObject). I'm not sure this would necessarily make our memory management simpler, though, since Oilpan itself seems more complex than what we use right now to track our native objects.

mlippautz commented 3 years ago

If you have some bindings use cases that are problematic to get right, or if you just want a GCed C++ object model, then Oilpan may indeed be a great fit.

I'd say the only caveat right now with Oilpan shipped through V8 is that it doesn't (yet) come with containers. std containers are supported but require using non-incremental tracing for the embedder (--no-incremental-marking-wrappers as of today but the name may change).

@joyeecheung You are right, EmbedderHeapTracer doesn't actually manage C++ memory which is why you generally need MakeWeak() as well. Oilpan is a full-fledged C++ GC, so yes, it is more complicated. However, none of that should be a concern to the embedder. The important piece there is that both types of references, V8->embedder and embedder->V8, are supported by default with V8 so there's no manual management involved there.

SalvatorePreviti commented 2 years ago

I tried to use Oilpan (cppgc) in a Node addon (Node 16, Node 17, and also a nightly build), but it does not seem to be fully linked/exposed. While I can compile a Node addon that uses it, I get runtime linking errors when I try to use it.

Error: dlopen(/xxxxxxxxxx/singularity.node, 0x0001): symbol not found in flat namespace '__ZN5cppgc8internal21RegisteredGCInfoIndexC1EPFvPvEb' at Object.Module._extensions..node (node:internal/modules/cjs/loader:1179:18)

Is there a plan to support it?

mcollina commented 2 years ago

> Is there a plan to support it?

Not at this point. I'm not sure what it would entail for our ABI stability.

SalvatorePreviti commented 2 years ago

Makes sense. It seems Google is stabilising the cppgc API. Please keep us updated; I am really interested in the possibility of using it in Node add-ons :) Thank you.

mcollina commented 2 years ago

@SalvatorePreviti if you would like to open a PR, we can discuss it there.

SalvatorePreviti commented 2 years ago

A small update: I built the latest version of Node locally from master and wrote a unit test in test/addons. With that build, Oilpan seems to be properly linked and exposed and can be used. However, cppgc::DefaultPlatform::InitializeProcess is not called by anybody yet. It has to be called once per process before a heap can be created, and the platform should be exposed. I'm thinking of adding it to ExecuteBootstrapper.

For reference, here is the work-in-progress branch: https://github.com/SalvatorePreviti/node/tree/expose-oilpan

addaleax commented 2 years ago

> However, cppgc::DefaultPlatform::InitializeProcess is not called yet by anybody, and this should be called once per process to be able to create a heap, and the platform should be exposed. Thinking to add it to ExecuteBootstrapper

This should probably go into InitializeOncePerProcess() if it is comparable to V8::Initialize() in terms of when it should be called :+1:

SalvatorePreviti commented 2 years ago

Interestingly, deps/v8/include/v8-cppgc.h exists but is currently not exported by Node.

It seems intended for a GC heap shared between JS and C++: https://chromium.googlesource.com/v8/v8.git/+/HEAD/include/v8-cppgc.h

AttachCppHeap in v8-isolate.h is marked as experimental: https://chromium.googlesource.com/v8/v8.git/+/HEAD/include/v8-isolate.h#940

At first glance it seems to work as a replacement for cppgc::DefaultPlatform::InitializeProcess.

At the moment I still can't make it work; I get signal 11 (SIGSEGV).

SalvatorePreviti commented 2 years ago

It seems this should wait a bit: cppgc and v8-cppgc.h are under active development (the last commit to v8-cppgc.h was 5 days ago, and the last commit to cppgc was 5 hours ago). However, from my current understanding of the APIs, and assuming they do not change, I believe we should initialize a shared CppHeap from v8-cppgc.h at process startup and attach it to an Isolate whenever it is relevant to do so (or always).

SalvatorePreviti commented 2 years ago

@mlippautz could you give some information or shed some light on this and on the stability of cppgc and v8-cppgc.h?

mlippautz commented 2 years ago

cppgc has shipped in v9.4 for Blink where it's used as the production garbage collector for C++.

We are actively working on cppgc and its APIs, but everything that you can find on the public API surface, e.g. in include/cppgc/* or in include/v8-cppgc.h, is considered stable and follows V8's general API stability. In other words, as for all other APIs, they are not set in stone but will follow the regular deprecation cycles, and we will try hard to allow a smooth migration when removing things.

owinebar commented 2 years ago

@mlippautz - I've been going through the doc and source for the last few days and wonder if part of the issue is that the V8 embedding documentation is out of date. It says there are basically 2 types allowed to reference "JavaScript objects", Local and Persistent. I don't think that is true now.

First, the term "JavaScript objects" is a bit misleading. The subject should really be "objects managed by the V8 GC", of which some are owned by V8 ("proper JS objects") and others are owned by embedders and subclassed from the GarbageCollected template. The former appear on the stack only when wrapped by Local handles, whereas the latter must never be allocated on the stack and appear there only as raw pointers.

For now, the definitive reference for v8 embedders that want to either use GarbageCollected objects or export that capability would seem to be Blink's embedding of cppgc as provided by V8.

@SalvatorePreviti As for the initialization segfault, there is a test program "cppgc_hello_world" in v8. When built as part of building v8 targets, this program segfaults. When built with the "cppgc_is_standalone" option (which forbids building any targets other than cppgc or related tests), that test program prints the expected message. If you look at Blink's process_heap.cc, it just calls gin::InitializeCppgcFromV8Platform() which just guards against calling the initializer more than once, and calls cppgc::InitializeProcess on the first attempt.

owinebar commented 2 years ago

I opened a documentation issue on Monorail for the misalignment of the docs (including code comments) with the code. The segfault was reported a while ago as Issue 12427, but it wasn't noted that the test succeeds when cppgc is built as a standalone library.

owinebar commented 2 years ago

@mcollina The active release of node is pretty far behind V8 with respect to cppgc. The include/cppgc/README.md file in 16.13.12 is 2 years old, from when oilpan was first introduced in V8. 17.4.0 has updated that to a version from a few months ago, but some of the simplifications in the current V8 interface for putting V8 objects into managed C++ objects have not been pulled in yet. It looks like @targos has been bringing the newer cppgc code into the current node.

@mlippautz has been kindly explaining how embedders can introduce "managed C++" to JS edges and vice versa at the cppgc documentation issue on Monorail. That use case appears to have been significantly streamlined. Now JS objects can be introduced into managed C++ objects using a Traced reference without any intervening persistent or global handle that would be treated as a root by the GC.

I imagine the big question for node developers is how you want to define BaseObject. Do you continue with the reference counting scheme on a global handle, or make it a managed C++ object? I would think it's only controversial to the extent it could be a barrier to supporting other JS runtimes with the same ABI. The standalone version of cppgc might ameliorate that concern, I don't know. Otherwise it seems like a huge win to me.

mcollina commented 2 years ago

> @mcollina The active release of node is pretty far behind V8 with respect to cppgc. The include/cppgc/README.md file in 16.13.12 is 2 years old, from when oilpan was first introduced in V8. 17.4.0 has updated that to a version from a few months ago, but some of the simplifications in the current V8 interface for putting V8 objects into managed C++ objects have not been pulled in yet. It looks like @targos has been bringing the newer cppgc code into the current node.

> @mlippautz has been kindly explaining how embedders can introduce "managed C++" to JS edges and vice versa at the cppgc documentation issue on Monorail. That use case appears to have been significantly streamlined. Now JS objects can be introduced into managed C++ objects using a Traced reference without any intervening persistent or global handle that would be treated as a root by the GC.

This looks like a massive win!

> I imagine the big question for node developers is how you want to define BaseObject. Do you continue with the reference counting scheme on a global handle, or make it a managed C++ object? I would think it's only controversial to the extent it could be a barrier to supporting other JS runtimes with the same ABI. The standalone version of cppgc might ameliorate that concern, I don't know. Otherwise it seems like a huge win to me.

We are currently not supporting other JS runtimes anymore. A PoC would be really useful.

@addaleax @jasnell wdyt?

addaleax commented 2 years ago

@mcollina I think it’s fine to try this out for sure.

owinebar commented 2 years ago

On Fri, Jan 28, 2022, 3:10 AM Matteo Collina @.***> wrote:

> We are currently not supporting other JS runtimes anymore.
>
> A PoC would be really useful.

For reference, I don't see the v8::Object::SetAlignedPointerInInternalField method in the V8 API until version 9.5, in v8-object.h. There are a lot of new header files in that version.

Blink has moved to using the V8 cppgc interface, but they still have a lot of code in src/renderer/platform/heap for dealing with thread-local allocation using cppgc, heap allocated collections and probably other common constructions.

owinebar commented 2 years ago

Has there been any work on this?

dharesign commented 1 year ago

We are making use of cppgc in node add-ons. As a result we have modified tools/install.py to include all the cppgc related headers. We are initializing cppgc in our add-on, rather than modifying Node.js to initialize it, but we can change that. I created a PR to get the ball rolling: #45704

joyeecheung commented 1 year ago

I drafted a design doc on how to integrate Oilpan into Node.js: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit

I also did a POC by migrating CompiledFnEntry while I was testing https://github.com/nodejs/node/pull/45704. CompiledFnEntry is probably going away anyway because of https://github.com/nodejs/node/pull/48510, but it was a good exercise and helped me a bit when figuring out the design. The next target, as described in the doc, would be AliasedBuffers.

mcollina commented 1 year ago

I reviewed this, and the plan for fixing the problem with ShadowRealm is sound. I think it might also give us a bit more garbage collection performance.

joyeecheung commented 1 year ago

With CompiledFnEntry already gone, I went on to experiment with migrating a few internal classes. I have a WIP branch to figure out a better migration strategy and to track which internal utilities need to be updated or reinvented. I will add more detailed findings to the doc, but to share one finding of mine now: I think it's better to migrate in this order (which is actually the opposite of my original plan):

There are still several helpers I need to abstract out, especially for integrations of heap snapshots/startup snapshots. I may send a PR soon-ish when I think they are stable/generic enough.

joyeecheung commented 11 months ago

An interesting update: I noticed that moving from a weak BaseObject to Oilpan management (without cleanup queue book-keeping) is about 2.5x faster:

misc/object-wrap.js method="ExampleCppgcObject" n=1000000: 8,113,612.185256452
misc/object-wrap.js method="ExampleBaseObject" n=1000000: 3,218,022.6465318813

The BaseObject overhead can actually become a bottleneck. I opened https://github.com/nodejs/node/pull/51017 to show a POC of migrating crypto::Hash to Oilpan, with a noticeable speedup, using some helpers that I've used to test migration on various objects. (I think the overhead comes partly from global handle initialization and partly from cleanup queue book-keeping. I haven't looked deeper into it yet, as those tend to be inlined.)

owinebar commented 9 months ago

> I drafted a design doc on how to integrate Oilpan into Node.js: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit

Thanks for this - it's useful even for non-node embedders.

joyeecheung commented 7 months ago

Some updates:

                                                                                                                                    confidence improvement accuracy (*)   (**)  (***)
vm/compile-script-in-isolate-cache.js n=1000 filename='test/fixtures/snapshot/typescript.js' type='with-dynamic-import-callback'                   -0.91 %       ±1.84% ±2.45% ±3.20%
vm/compile-script-in-isolate-cache.js n=1000 filename='test/fixtures/snapshot/typescript.js' type='without-dynamic-import-callback'                -2.79 %       ±4.24% ±5.70% ±7.55%
vm/compile-script-in-isolate-cache.js n=1000 filename='test/fixtures/syntax/good_syntax.js' type='with-dynamic-import-callback'                     3.39 %       ±4.28% ±5.70% ±7.42%
vm/compile-script-in-isolate-cache.js n=1000 filename='test/fixtures/syntax/good_syntax.js' type='without-dynamic-import-callback'         ***      6.93 %       ±3.08% ±4.11% ±5.35%
See regression numbers

```
                                                                   confidence improvement accuracy (*)   (**)  (***)
v8/deserialize-array.js n=10000 len=1024 type='array'                     ***    -40.29 %       ±1.62% ±2.15% ±2.80%
v8/deserialize-array.js n=10000 len=1024 type='bigint-typed-array'        ***    -22.46 %       ±1.11% ±1.48% ±1.93%
v8/deserialize-array.js n=10000 len=1024 type='typed-array'               ***     -6.79 %       ±3.07% ±4.11% ±5.41%
v8/deserialize-array.js n=10000 len=16 type='array'                       ***     -8.25 %       ±1.63% ±2.17% ±2.83%
v8/deserialize-array.js n=10000 len=16 type='bigint-typed-array'          ***     -9.01 %       ±2.56% ±3.41% ±4.46%
v8/deserialize-array.js n=10000 len=16 type='typed-array'                 ***     -6.63 %       ±2.40% ±3.22% ±4.23%
v8/deserialize-array.js n=10000 len=256 type='array'                      ***    -13.26 %       ±1.07% ±1.43% ±1.86%
v8/deserialize-array.js n=10000 len=256 type='bigint-typed-array'         ***    -12.58 %       ±1.80% ±2.40% ±3.13%
v8/deserialize-array.js n=10000 len=256 type='typed-array'                ***     -7.78 %       ±2.55% ±3.41% ±4.48%
v8/deserialize-values.js n=100000 type='object'                           ***    -25.36 %       ±0.82% ±1.09% ±1.42%
v8/deserialize-values.js n=100000 type='string'                           ***    -10.81 %       ±0.94% ±1.25% ±1.63%
v8/serialize-array.js n=10000 len=1024 type='array'                       ***     -6.28 %       ±1.14% ±1.52% ±1.98%
v8/serialize-array.js n=10000 len=1024 type='bigint-typed-array'          ***    -23.61 %       ±1.07% ±1.43% ±1.86%
v8/serialize-array.js n=10000 len=1024 type='typed-array'                 ***    -30.91 %       ±1.53% ±2.05% ±2.68%
v8/serialize-array.js n=10000 len=16 type='array'                         ***    -24.23 %       ±1.63% ±2.17% ±2.82%
v8/serialize-array.js n=10000 len=16 type='bigint-typed-array'            ***    -32.05 %       ±1.49% ±1.99% ±2.60%
v8/serialize-array.js n=10000 len=16 type='typed-array'                   ***    -25.83 %       ±1.33% ±1.78% ±2.31%
v8/serialize-array.js n=10000 len=256 type='array'                        ***    -14.19 %       ±1.44% ±1.92% ±2.51%
v8/serialize-array.js n=10000 len=256 type='bigint-typed-array'           ***    -29.26 %       ±1.51% ±2.01% ±2.63%
v8/serialize-array.js n=10000 len=256 type='typed-array'                  ***    -33.04 %       ±1.52% ±2.03% ±2.66%
v8/serialize-values.js n=100000 type='object'                             ***    -22.56 %       ±2.27% ±3.03% ±3.97%
v8/serialize-values.js n=100000 type='string'                             ***    -22.43 %       ±1.70% ±2.27% ±2.99%
```

joyeecheung commented 3 months ago

https://github.com/nodejs/node/pull/52295 is ready for review now after rebasing on top of the V8 v12.8 upgrade. It adds a few helpers for cppgc migration and migrates ContextifyScript, which is one of the very few classes in core that doesn't have any externally managed data (for classes that do, we'll need to wait for https://chromium-review.googlesource.com/c/v8/v8/+/5630497). The small performance improvement in compiling small scripts is still there (and seems more evident?) after the rebase.