nginx / njs

A subset of JavaScript language to use in nginx
http://nginx.org/en/docs/njs/
BSD 2-Clause "Simplified" License
1.14k stars 150 forks source link

High memory usage after updating to NJS 0.8.6 #795

Open xbb123 opened 3 days ago

xbb123 commented 3 days ago

Describe the bug

The NJS 0.8.6 has much higher memory footprint than 0.8.5. On our stage server, when running our NJS test cases, the previous version 0.8.5 use about 0.8% of memory. Running the same test case set, the memory usage will go up to 25% and easily gets OOM killed when running pressure tests. Referring to my previous issue: https://github.com/nginx/njs/issues/185 We have discovered that variables (lists) with high numbers of elements will consume memory quite quickly; although this may not being the only factor: removing array with long variable does not reduce the memory footprint to normal level. Downgrading to NJS 0.8.5 resolved the issue.

To reproduce

Steps to reproduce the behavior: Referring to my previous issue: https://github.com/nginx/njs/issues/185 for gist

Your NGINX logs here


Oct  8 12:53:59 Stage kernel: [1096221.064717] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=systemd-networkd.service,mems_allowed=0,global_oom,task_memcg=/system.slice/nginx.service,task=nginx,pid=1536276,uid=0
Oct  8 12:53:59 Stage kernel: [1096221.064737] Out of memory: Killed process 1536276 (nginx) total-vm:2945532kB, anon-rss:1742516kB, file-rss:2304kB, shmem-rss:0kB, UID:0 pgtables:5028kB oom_score_adj:0
Oct  8 12:53:59 Stage kernel: [1096221.448365] systemd[1]: snapd.service: Watchdog timeout (limit 5min)!
Oct  8 12:53:59 Stage kernel: [1096221.453954] systemd[1]: snapd.service: Killing process 1433291 (snapd) with signal SIGABRT.
Oct  8 12:54:00 Stage kernel: [1096221.779352] systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
Oct  8 12:54:00 Stage kernel: [1096221.851480] systemd[1]: session-59088.scope: Deactivated successfully.
Oct  8 12:54:00 Stage kernel: [1096221.856485] systemd[1]: session-59088.scope: Consumed 18.556s CPU time.
Oct  8 12:54:00 Stage kernel: [1096222.067511] systemd[1]: snapd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Oct  8 12:54:00 Stage kernel: [1096222.067523] systemd[1]: snapd.service: Failed with result 'watchdog'.
Oct  8 12:54:00 Stage kernel: [1096222.068074] systemd[1]: snapd.service: Consumed 16.648s CPU time.
Oct  8 12:54:00 Stage kernel: [1096222.206459] systemd[1]: snapd.service: Scheduled restart job, restart counter is at 2.
Oct  8 12:54:00 Stage kernel: [1096222.206683] systemd[1]: Stopped Snap Daemon.
Oct  8 12:54:00 Stage kernel: [1096222.206736] systemd[1]: snapd.service: Consumed 16.648s CPU time.
Oct  8 12:54:00 Stage kernel: [1096222.217663] systemd[1]: Starting Snap Daemon...
Oct  8 12:54:01 Stage kernel: [1096223.505220] systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
Oct  8 12:54:01 Stage kernel: [1096223.505323] systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
Oct  8 12:54:01 Stage kernel: [1096223.506197] systemd[1]: systemd-journald.service: Consumed 4min 50.232s CPU time.
Oct  8 12:54:01 Stage kernel: [1096223.506913] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 1.
Oct  8 12:54:01 Stage kernel: [1096223.507648] systemd[1]: Stopped Journal Service.
Oct  8 12:54:01 Stage kernel: [1096223.507720] systemd[1]: systemd-journald.service: Consumed 4min 50.232s CPU time.
Oct  8 12:54:01 Stage kernel: [1096223.514640] systemd[1]: Starting Journal Service...
Oct  8 12:54:02 Stage kernel: [1096223.587813] systemd-journald[1536807]: File /var/log/journal/b3a5fdbd1f9c4041acc272a500182f33/system.journal corrupted or uncleanly shut down, renaming and replacing.
Oct  8 12:54:02 Stage kernel: [1096223.783435] systemd[1]: Started Journal Service.
Oct  8 12:54:02 Stage systemd[1]: nginx.service: A process of this unit has been killed by the OOM killer.
Oct  8 12:54:02 Stage systemd[1]: nginx.service: Main process exited, code=killed, status=9/KILL
Oct  8 12:54:02 Stage sh[1536724]: kill: (1536276): No such process
Oct  8 12:54:02 Stage systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
Oct  8 12:54:02 Stage systemd[1]: systemd-journald.service: Killing process 309 (systemd-journal) with signal SIGABRT.

### Expected behavior

A clear and concise description of what you expected to happen.

### Your environment

- Version of njs or specific commit 0.8.6
- Version of NGINX if applicable 1.27.2
- List of other enabled nginx modules if applicable
- OS: Ubuntu 24.04 & 22.04 exactly the same behavior 

### Additional context

Thank you!
xeioex commented 1 day ago

Hi @xbb123,

I believe I found the relevant issue. Could you, please, test that it fixes your problem? See https://github.com/nginx/njs/pull/798.