nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Does Node 20.x fully support cgroups v2? Facing a memory ceiling problem on a Kubernetes cluster #52478

Open PedroFonsecaDEV opened 7 months ago

PedroFonsecaDEV commented 7 months ago

Version

20.x

Platform

No response

Subsystem

No response

What steps will reproduce the bug?

Is #47259 fixed in Node.js 20.x?

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior? Why is that the expected behavior?

Node.js being aware of the memory and CPU available via cgroups v2.
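
For reference, a minimal way to see what the runtime believes it can use (a sketch, assuming a shell inside one of the affected pods with Node 20):

```sh
# Memory limit libuv/Node derives from the OS/cgroup, in bytes (0 means no constraint was detected)
node -p "process.constrainedMemory()"

# Default V8 heap ceiling, which is sized from the detected memory
node -p "require('v8').getHeapStatistics().heap_size_limit"

# Parallelism; as far as I know this should also reflect a cgroup CPU quota
node -p "require('os').availableParallelism()"
```

If these report the host's resources rather than the pod's limits, the cgroup v2 values are not being picked up.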

What do you see instead?

Pods running Node.js on my cluster are running out of memory.

Additional information

No response

mcollina commented 7 months ago

In theory yes.

PedroFonsecaDEV commented 7 months ago

Hi @mcollina, could you please expand on that?

mcollina commented 7 months ago

As far as I understand #47259, Node v20.x includes the libuv version with the necessary fix. However, I have not tested it.
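
A quick way to test it end to end (a sketch, assuming a Docker host that is already on cgroup v2; the image tag is only an example):

```sh
# Give the container a 320 MiB limit and ask Node what it detects.
# If cgroup v2 detection works, this should print the limit (335544320), not the host's memory.
docker run --rm -m 320m node:20-alpine node -p "process.constrainedMemory()"
```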

ben-bertrands-hs commented 5 months ago

We have the same issue and are switching back to cgroup v1.

mcollina commented 5 months ago

cc @santigimeno

jonesbusy commented 5 months ago

After upgrading to cgroup v2 on our OKD cluster we don't have any issues running Node 20.13.1 apps. Only Node 18 caused issues, which were mitigated using max_old_space_size.
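
For anyone who needs the same mitigation, a sketch of the workaround (the 256 value and server.js are placeholders; pick a heap size comfortably below the pod's memory limit so non-heap memory still fits):

```sh
# Cap V8's old space explicitly instead of relying on cgroup detection.
# In Kubernetes this is typically set as an env var on the container.
NODE_OPTIONS="--max-old-space-size=256" node server.js
```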

ben-bertrands-hs commented 5 months ago

Hi. Below is the output for cgroup v1 vs cgroup v2 for the same container image, both with a memory limit of 320Mi.

This is on a node using cgroupv1

```
/app $ node -e "console.log(v8.getHeapStatistics())"
{
  total_heap_size: 4407296,
  total_heap_size_executable: 262144,
  total_physical_size: 3936256,
  total_available_size: 268868568,
  used_heap_size: 3652248,
  heap_size_limit: 271581184,
  malloced_memory: 262296,
  peak_malloced_memory: 108144,
  does_zap_garbage: 0,
  number_of_native_contexts: 1,
  number_of_detached_contexts: 0,
  total_global_handles_size: 8192,
  used_global_handles_size: 2240,
  external_memory: 1342981
}
/app $ node -v
v20.13.1
/app $ node -e "console.log(process.availableMemory())"
204288000
```

while this is on a cgroupv2 node

```
/app $ node -e "console.log(v8.getHeapStatistics())"
{
  total_heap_size: 4407296,
  total_heap_size_executable: 262144,
  total_physical_size: 3936256,
  total_available_size: 2195102632,
  used_heap_size: 3652280,
  heap_size_limit: 2197815296,
  malloced_memory: 262296,
  peak_malloced_memory: 108144,
  does_zap_garbage: 0,
  number_of_native_contexts: 1,
  number_of_detached_contexts: 0,
  total_global_handles_size: 8192,
  used_global_handles_size: 2240,
  external_memory: 1342981
}
/app $ node -v
v20.13.1
/app $ node -e "console.log(process.availableMemory())"
8880730112
```

Is something else going on here?
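
One way to narrow it down (a sketch, run inside the same container on the cgroup v2 node) is to compare the limit the kernel exposes with what Node reports:

```sh
# cgroup v2 memory limit for the container; should show the 320Mi from the pod spec
cat /sys/fs/cgroup/memory.max

# What libuv/Node derived from it; 0 or a host-sized number would mean detection failed
node -p "process.constrainedMemory()"
```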

ben-bertrands-hs commented 5 months ago

@mcollina can you reopen this issue?

rescomms-tech commented 1 month ago

@ben-bertrands-hs By any chance, are you running on Alpine Linux? AFAIK its Node.js build links against the Alpine-supplied libuv, which, until recently, was a fairly old one.
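
A quick way to check which libuv the binary actually uses (a sketch; ldd may not be present in every minimal image):

```sh
# libuv version the running node binary reports
node -p "process.versions.uv"

# A libuv.so line here means node is dynamically linked against the system (apk) libuv;
# no output usually means libuv is statically bundled into the node binary.
ldd "$(command -v node)" | grep -i libuv
```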

ben-bertrands-hs commented 1 month ago

@rescomms-tech Yes, we are running Alpine Linux (node:20.17.0-alpine). I just checked and we still have this issue when switching to cgroup v2.

Doing a find in the container only returns these:

```
~ $ find / -type f -regex '.*uv.*'
find: /proc/tty/driver: Permission denied
find: /root: Permission denied
/usr/local/include/node/uv/aix.h
/usr/local/include/node/uv/bsd.h
/usr/local/include/node/uv/darwin.h
/usr/local/include/node/uv/errno.h
/usr/local/include/node/uv/linux.h
/usr/local/include/node/uv/os390.h
/usr/local/include/node/uv/posix.h
/usr/local/include/node/uv/sunos.h
/usr/local/include/node/uv/threadpool.h
/usr/local/include/node/uv/tree.h
/usr/local/include/node/uv/unix.h
/usr/local/include/node/uv/version.h
/usr/local/include/node/uv/win.h
/usr/local/include/node/uv.h
```

Checking /usr/local/include/node/uv/version.h gives me version 1.46.0. Every version from 1.45 on should support cgroup v2, right?
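
Worth noting that the headers under /usr/local/include/node are shipped for building native addons; the version the running binary actually uses is the one it reports at runtime, e.g.:

```sh
# Runtime-linked libuv version, which is what matters for cgroup v2 detection
node -p "process.versions.uv"
```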