nodejs / build

Better build and test infra for Node.

Path to adding a Tier 2 build with pointer compression enabled #3204

Closed cjihrig closed 2 weeks ago

cjihrig commented 1 year ago

This is a follow-up to https://github.com/nodejs/TSC/issues/790. The project currently creates unofficial builds for Linux with pointer compression enabled. With pointer compression enabled, the main branch appears to pass the test suite (at least on macOS, where I tested). From what I can tell, v19 and earlier are not in such good shape.

We are interested in getting pointer compression into the CI, and eventually, hopefully, getting it to Tier 2 status (Linux alone would be enough). cc: @nodejs/releasers.

I think the first step would be getting this into the CI so that we can have more confidence in the current unofficial build. Unfortunately, I don't have the slightest idea what needs to happen to add a new CI target - but I am willing to do the work if someone can provide guidance.

cc: @mhdawson @mcollina
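For anyone who wants to reproduce the local testing described above, a pointer-compression build can be produced from a Node.js source checkout roughly as follows. This is a sketch: the `--experimental-enable-pointer-compression` configure flag is what the unofficial builds use, but verify the exact flag name against `./configure --help` in your checkout.

```shell
# From a Node.js source checkout. Flag name per the unofficial-builds
# recipe; check ./configure --help if your checkout differs.
./configure --experimental-enable-pointer-compression
make -j"$(nproc)"

# Sanity-check the resulting binary against the test suite.
make test-only
```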

mhdawson commented 1 year ago

I think two questions we should get feedback on to start are:

1. @nodejs/tsc would it be ok to have pointer compression only enabled for Linux?
2. @nodejs/releasers any concern in respect to adding a compressed pointer release to the shipping binaries?

RafaelGSS commented 1 year ago

As a Node.js TSC member and Releaser, I'm +1. I don't see it impacting the release workflow directly. However, it might make sense to produce a release candidate for LTS versions first (optional).

richardlau commented 1 year ago
> @nodejs/releasers any concern in respect to adding a compressed pointer release to the shipping binaries?

Yes, I have concerns about confusion to end users in producing a second Linux download, particularly as I seem to recall concerns about ABI compatibility between regular builds and pointer compression enabled ones (think about ecosystem packages using prebuilds, for example).

cjihrig commented 1 year ago

@richardlau would you feel better about only adding pointer compression to the CI so that we can at least have more confidence in the existing unofficial build?

richardlau commented 1 year ago

@cjihrig my response was specifically to the quoted question. I'm neutral on adding another job to the CI.

mhdawson commented 1 year ago

Adding another job to the CI seems reasonable to me, as we already have machines of the required type available, and my first guess is that we would not need additional machines to support the one extra job.

cjihrig commented 1 year ago

@mhdawson so what would be the next step to add the CI job? I'm happy to help, but don't know where/how to get started.

mhdawson commented 1 year ago

I don't see any objections to adding a CI job, so I'll set up a job Colin can edit/work on and then we can take it from there. It probably will mostly be a clone of some other job.

mhdawson commented 1 year ago

Added this job with @cjihrig - https://ci.nodejs.org/job/node-test-commit-linux-pointer-compression.

Just letting it run to confirm it passes and then will add to the daily main run.

mhdawson commented 1 year ago

To answer a question people might ask: we did not add this as a containerized build, as I was not sure how to do that, and we only have it run in the nightly.

mhdawson commented 1 year ago

Configured the job so that @cjihrig can edit/modify it in case changes/fixes are needed.

mhdawson commented 1 year ago

Had to fix up the addition to the daily main job as it did not run today. I believe it should run next time.

cjihrig commented 1 year ago

@mhdawson thanks for the help getting the pointer compression job into the daily job. It looks like it has passed every time it has run for the past few weeks with the exception of one time when GitHub changed the key.

As a next step, does it make sense to add pointer compression to the CI for pull requests on main (from my testing, v19 and older do not work properly with pointer compression)?

mhdawson commented 1 year ago

The part about only running on v19 and later is one that may require something special. @richardlau I don't think the VersionSelectorScript can help us on this one, right?

richardlau commented 1 year ago

That is exactly what the VersionSelectorScript is intended for. However, it works on labels (specifically on a build parameter named "nodes"), so you may need to rework the job and/or introduce new labels.

cjihrig commented 1 year ago

@mhdawson I don't really know what the VersionSelectorScript is, but is it possible to use it to try to move this forward? Happy to help however I can.

mhdawson commented 1 year ago

@cjihrig maybe we can get together and work on it together for an hour or so. Could you send an invitation with some candidate times?

richardlau commented 1 year ago

FWIW https://ci.nodejs.org/job/node-test-commit-linux-pointer-compression has been broken since last week.

cjihrig commented 1 year ago

@mhdawson will do.

> has been broken since last week

All of the failures appear to be test/parallel/test-experimental-shared-value-conveyor.js (recently added by the V8 team for an experimental feature). Easy enough to skip when pointer compression is enabled, but FYI @syg.
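Skipping a test under pointer compression can key off the build configuration that Node exposes at runtime. A minimal check, assuming the `v8_enable_pointer_compression` variable from config.gypi is present (as it is in recent builds):

```shell
# Prints 1 on a pointer-compression build, 0 otherwise (undefined on
# very old releases) -- the same signal a test skip could key off.
node -p "process.config.variables.v8_enable_pointer_compression"
```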

jdmarshall commented 1 year ago

I'm not sure I understand why pointer compression would be a tier 2 build instead of the default.

How many people are using more than 4GB of memory in Node, per thread? They're the special case, not the rest of us who are using a handful of gigabytes per process.

kkocdko commented 11 months ago

The build after v21 seems broken. https://unofficial-builds.nodejs.org/download/release/v21.1.0/node-v21.1.0-linux-x64-pointer-compression.tar.xz

```
[kkocdko@klf bin]$ ldd --version
ldd (GNU libc) 2.37
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
[kkocdko@klf bin]$ uname -a
Linux klf 6.4.14-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Sep  2 16:36:06 UTC 2023 x86_64 GNU/Linux
[kkocdko@klf bin]$ ./node
Segmentation fault (core dumped)
[kkocdko@klf bin]$ pwd
/home/kkocdko/misc/node-v21.1.0-linux-x64-pointer-compression/bin
```

The pointer-compression v21.0.0 build fails in the same way.

The pointer-compression v20.9.0 build works fine.

The official v21.1.0 build works fine.

kvakil commented 11 months ago

> The build after v21 seems broken. https://unofficial-builds.nodejs.org/download/release/v21.1.0/node-v21.1.0-linux-x64-pointer-compression.tar.xz

The build was broken by the V8 update. I put up a fix here: https://github.com/nodejs/node/pull/50680

mikeduminy commented 5 months ago

I'd love to be able to use an official build that enables pointer compression on mac. Would that need a separate build?

mcollina commented 5 months ago

yes

ronag commented 2 months ago

Did we give up on pointer compression? I've been unable to try it out. Haven't been able to find a recent build nor have been able to figure out how to create a build myself. Hard to give feedback (for anyone) with the current state of things.

richardlau commented 2 months ago

> Did we give up on pointer compression? I've been unable to try it out. Haven't been able to find a recent build nor have been able to figure out how to create a build myself. Hard to give feedback (for anyone) with the current state of things.

Nobody appears to be paying any attention to it.

jasnell commented 2 months ago

No, we haven't given up and are definitely still paying attention. Cloudflare has engaged Igalia under contract to look at a path that would hopefully allow, in part, Node.js to enable pointer compression by default in the future.

Specifically, one of the limiting factors with enabling pointer compression to this point has been the strict limit of a single 4gb pointer cage per process. Limiting the entire process to a single 4gb cage would be too much of a breaking change.

To support the alternative configuration we have in Cloudflare Workers, for a different use case that touches on the same parts of V8, Igalia is working on a way to enable multiple pointer cages per process. We call the new API an IsolateGroup. Each IsolateGroup represents one cage, with N Isolates allowed per IsolateGroup, and a process would allow any number of IsolateGroups to be created. Each IsolateGroup still has the 4gb limit, but the ability to create multiple cages per process is a lot more tractable than a single 4gb cage for the entire process.

While it would still definitely be a semver-major and potentially breaking change, I think this approach would make it far more likely that we can enable pointer compression by default in Node.js or at least provide a release distribution with pointer compression enabled.

ronag commented 2 months ago

> Specifically, one of the limiting factors with enabling pointer compression to this point has been the strict limit of a single 4gb pointer cage per-process.

FWIW this is not a problem for us. I don't think it's a problem for a majority of users. IMHO two separate builds would be fine. Advanced users can use the pointer compression build.

mcollina commented 2 months ago

In the past, I thought we (Platformatic) could sponsor some of this infra work, but startup life got in the way, and this is not a priority anymore. We would definitely use a Node.js build with pointer compression enabled, but the main driver for investing to make it happen is gone.

jasnell commented 2 months ago

My hope is that the new IsolateGroups will at least make it less of a potentially breaking change. We have users, however, that depend on extremely large heap sizes that, even with compression, go well beyond the 4gb limit so we have to be careful. Having the IsolateGroup API there should also make it easier for us to transition to this mode in general.

SeanReece commented 2 months ago

> FWIW this is not a problem for us. I don't think it's a problem for a majority of users.

At the risk of beating a dead horse here, I have to agree. `--max-old-space-size` defaults to 2GB on 64-bit systems, and I'm willing to bet most Node developers have not had a use case to increase this. Even having pointer compression be the default would increase the default heap limit 2x, and advanced developers could just switch to the non-compressed build for use cases requiring more heap space.
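The effective heap limit for a given binary can be checked directly; the exact figure varies by Node version, build flags, and available system memory, so no particular value is assumed here:

```shell
# Print this node binary's default V8 heap size limit, in MB.
node -e "console.log(require('v8').getHeapStatistics().heap_size_limit / 1024 / 1024)"
```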

What @jasnell is working on sounds very interesting and promising. But I hope we can get the pointer compression build back on track regardless.

jdmarshall commented 1 month ago

@jasnell What do you mean by well beyond 4GB? If you have lots of object graphs, it's entirely possible for a 6GB heap to comfortably fit into 4GB with compression enabled.

Also, I haven't heard anybody petitioning to eliminate 64-bit pointers. The argument is about which is the default case and which is the special case. Right now it's very much backward from reality.

jdmarshall commented 1 month ago

> FWIW this is not a problem for us. I don't think it's a problem for a majority of users. IMHO two separate builds would be fine. Advanced users can use the pointer compression build.

@ronag I would draw the opposite conclusion from this. If a 4GB cage isn't a problem for most users, then the advanced users should be the ones looking at the uncompressed build, not the other way around.

jasnell commented 1 month ago

I've talked with folks that run in excess of 20GB heaps in certain extreme cases. I think Netflix has also seen this, but I'd need to confirm. I do think the excessively sized heaps are not the norm, and experience has shown that the vast majority of cases fit well within a 4GB compressed heap.

jdmarshall commented 2 weeks ago

@cjihrig completed?

cjihrig commented 2 weeks ago

No. Clearing my issues list. If anyone needs this issue, please open a new one.