hassio-addons / addon-node-red

Node-RED - Home Assistant Community Add-ons
https://addons.community
MIT License
513 stars 117 forks source link

After update to v17.0.0 NodeRed does not start, cpu usage at max #1793

Closed drftg closed 5 months ago

drftg commented 5 months ago

After update to v17.0.0 my cpu got hot without any working automations.

No matter the loglevel, the only log I can catch is

s6-rc: info: service s6rc-oneshot-runner: starting s6-rc: info: service s6rc-oneshot-runner successfully started s6-rc: info: service base-addon-banner: starting

Looking at the processes I see this:

root@a0d7b954-nodered:/opt$ ps PID USER TIME COMMAND 1 root 0:00 /package/admin/s6/command/s6-svscan -d4 -- /run/service 15 root 0:00 {rc.init} /bin/sh -e /run/s6/basedir/scripts/rc.init top 16 root 0:00 s6-supervise s6-linux-init-shutdownd 18 root 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B 25 root 0:00 s6-supervise s6rc-oneshot-runner 26 root 0:00 s6-supervise s6rc-fdholder 27 root 0:00 s6-supervise nginx 28 root 0:00 s6-supervise nodered 32 root 0:00 s6-rc -v2 -u -t 0 -- change top 36 root 0:00 /package/admin/s6/command/s6-ipcserverd -1 -- /package/admin/s6/command/s6-ipcserver-access -v0 -E -l0 -i data/rules -- /package/admin/s6/command/s6-sudod -t 30000 -- /package/admin/s6-rc/command/s6-rc-oneshot-run -l ../.. -- 37 root 0:00 /package/admin/s6-2.12.0.2/command/s6-sudoc -e -t 30000 -T 0 -- up 7 38 root 0:00 /package/admin/s6/command/s6-sudod -t 30000 -- /package/admin/s6-rc/command/s6-rc-oneshot-run -l ../.. -- 39 root 0:00 bash /usr/bin/bashio /package/admin/s6-overlay/etc/s6-rc/scripts/base-addon-banner 44 root 2:44 curl --silent --show-error --write-out \n%{http_code} --request GET -H Authorization: Bearer 7376a2d7c5ee13da2239a920c4e3bb1fd0b028fa35401f492fcc7a6e35f8057bca20f3b715a32e55d4d3f31045d7338f15f981a8f39bcf63 -H Content-Type: application/json -d {} http://supervisor/supervisor/ping 45 root 0:00 /bin/sh -c curl --fail http://127.0.0.1:46836 || exit 1 52 root 2:17 curl --fail http://127.0.0.1:46836 74 root 0:00 /bin/sh -c curl --fail http://127.0.0.1:46836 || exit 1 81 root 1:18 curl --fail http://127.0.0.1:46836 82 root 0:00 /bin/bash 95 root 0:00 /bin/sh -c curl --fail http://127.0.0.1:46836 || exit 1 101 root 0:18 curl --fail http://127.0.0.1:46836 106 root 0:00 ps

PID 44 takes about 20% cpu. After it started multiple processes curl --fail http://127.0.0.1:46836 the cpu is at max and I have to stop the container.

I can ping supervisor. wget http://supervisor/supervisor/ping gets a result OK wget http://127.0.0.1:46836 results in "Connection Refused"

What else can I try to get more information about why cpu is at max?

Steps to reproduce

I am running HA on a Synology NAS

Supervisor 2023.12.1 Home Assistant 2024.1.2

frenck commented 5 months ago

pid 44 is a very standard/simple ping request... I have no clue why that would even take long.

drftg commented 5 months ago

My guess is it has something to do with the version of curl 8.5.0. When executing the PID 44 command from shell I get no result (just hangs). When I try that command from the add-on vscode (v5.15.0) which has curl v7.88.1 the command returns {result: "ok", data: {}}

mmarcines commented 5 months ago

Hi,

I've noticed same problem when nodered17.0.0 was published but didn't have time to report it. Now we have 17.0.2 - but for me issue is exact the same.

After upgrading to 17.0.0 or 17.0.2 - nodered addon will never start. HomeAssistant supervised is running as docker images on Synology DSM. Same as nodered addon HomeASsistant Core 2024.1.2 Supervisor 2023.12.0

When addon nodered is running 16.0.2 release then it is running stable for long time without any issues. When running nodered16.0.2 for just few minutes (3-4 minutes) CPU and memory load is low (number of processes is relatively small). Curl version is 8.4.0 and then curl --fail http://127.0.0.1:46836 command is returning a license agreement.

When I upgrade nodered to 17.0.2 (now) it is starting for longer time (and actually it never starts completely). After 12-13 minutes CPU usage is so high that container is killed by docker and it is booting again. After those 12-13 minutes CPU usage is HIGH and mostly consumed by curl processes. On nodered17.0.2 Curl version is 8.5.0 and indeed curl --fail http://127.0.0.1:46836 is not returning any result (it hangs?)

I've tried this command curl --fail http://127.0.0.1:46836 from z-wave container (where curl is 8.5.0 ) and at least there I"m receving an error: "Connection refused".

Attached are: 1) log from booting nodered 16.0.2 (addon_a0d7b954_nodered-16.0.2.txt) 2) log from booting nodered 17.0.2 (addon_a0d7b954_nodered-17.0.2.txt) 3) screenshots for nodered16.0.2 showing top, ps from within container as from Docker point of view (nodered-16.0.2-DSM.jpg, nodered16.0.2-top.jpg, nodered16.0.2-ps.jpg, nodered16.0.2-DSM-ps.jpg amd nodered16.0.2-curl.jpg). 4) screenshots for nodered17.0.2 showing top, ps from within container as from Docker point of view ( nodered-17.0.2-DSM-ps.jpg, nodered-17.0.2-DSM.jpg, nodered17.0.2-curl.jpg, nodered17.0.2-ps.jpg, nodered17.0.2-top.jpg) 5) screenshots showing curl command executed from zwave container (zwave-curl).

Would like to help more and provide any additional details - as I would like to keep nodered up to date as I'm using it and autoupdating it since 14.0.1 release (at least this is the oldest nodered backup which I have) and untill 17.0.0 release there was no issues.

Please let me know if anything else is needed to further analyze this issue?

zwave-curl nodered17 0 2-top nodered17 0 2-ps nodered17 0 2-curl nodered-17 0 2-DSM nodered-17 0 2-DSM-ps nodered16 0 2-curl nodered-16 0 2-DSM-ps nodered16 0 2-ps nodered16 0 2-top nodered-16 0 2-DSM addon_a0d7b954_nodered-17.0.2.txt addon_a0d7b954_nodered-16.0.2.txt

drftg commented 5 months ago

This seems to be a problem specific to Synology. See https://info.linuxserver.io/issues/2023-12-30-synology/ The workaround for it seems to be effective. In a shell within the docker container (before is overheats) run apk add c-ares -u --repository=https://dl-cdn.alpinelinux.org/alpine/edge/main After restarting the container the Node-RED addon starts up

mmarcines commented 5 months ago

Hi,

THanks for pointing us to issue with curl but this still is not solving the issue.

After upgrading to nodered17.0.2 I've checked c-ares module version (it was 1.22) After issuing apk add command c-ares was upgraded to 1.24

Once c-ares is at 1.24 - curl (curl --fail http://127.0.0.1:46836) is not hanging anymore - but there is an info "could not connect to server" - whereas in nodered16.0.2 same curl command is returing OpenJS foundataion lic agreement.

After restarting container - c-ares version is reverting back to 1.22 causing curl command to hang again (and causing very high CPU usage). c-ares-after-container-restart

Having 17.0.2 with c-ares upgraded (without reboot and with container reboot) - all the time container is not booting completed (correctly). Only those 3 steps are visible in the log:

stderr,s6-rc: info: service base-addon-banner: starting stderr,s6-rc: info: service s6rc-oneshot-runner successfully started stderr,s6-rc: info: service s6rc-oneshot-runner: starting

@drftg is this workaround working for you and nodered17.0.2 is booting properly on your Synology NAS ?

drftg commented 5 months ago

@mmarcines

Indeed, unfortunately things went bad for me too in version 17.0.2 and the addon keeps stopping itself

-----------------------------------------------------------
 Add-on version: 
 You are running the latest version of this add-on.
 System:   ( / )
 Home Assistant Core: 
 Home Assistant Supervisor: 
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service base-addon-banner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service base-addon-log-level: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service base-addon-log-level successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service init-nginx: starting
s6-rc: info: service init-customizations: starting
2024/01/15 19:15:50 invalid character '}' looking for beginning of value
s6-rc: warning: unable to start service init-nginx: command exited 1
s6-rc: info: service init-customizations successfully started
s6-rc: info: service init-nodered: starting
[19:15:51] WARNING: 
[19:15:51] WARNING: Direct access mode is disabled, Node-RED Dashboard
[19:15:51] WARNING: will not work!
[19:15:51] WARNING: 
[19:15:51] WARNING: Please assign a port in the Network section of this
[19:15:51] WARNING: add-on configuration.
[19:15:51] WARNING: 

up to date, audited 1 package in 1s

found 0 vulnerabilities
s6-rc: info: service init-nodered successfully started
s6-rc: info: service nodered: starting
s6-rc: info: service nodered successfully started
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
/run/s6/basedir/scripts/rc.init: fatal: stopping the container.
s6-rc: info: service nodered: stopping
[19:15:53] INFO: Service Node-RED exited with code 256 (by signal 15)
s6-rc: info: service nodered successfully stopped
s6-rc: info: service init-nodered: stopping
s6-rc: info: service init-nodered successfully stopped
s6-rc: info: service init-customizations: stopping
s6-rc: info: service init-customizations successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service base-addon-log-level: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service base-addon-log-level successfully stopped
s6-rc: info: service base-addon-banner: stopping
s6-rc: info: service base-addon-banner successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
drftg commented 5 months ago

Hmmmm

Patched again, restarted again and now it seems to work again. This is not very stable (but I am not touching it again for now!)

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service base-addon-banner: starting
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service base-addon-banner: starting

-----------------------------------------------------------
 Add-on: Node-RED
 Flow-based programming for the Internet of Things
-----------------------------------------------------------
 Add-on version: 17.0.2
 You are running the latest version of this add-on.
 System: null  (amd64 / qemux86-64)
 Home Assistant Core: 2024.1.3
 Home Assistant Supervisor: 2023.12.1
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service base-addon-banner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service base-addon-log-level: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service base-addon-log-level successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service init-nginx: starting
s6-rc: info: service init-customizations: starting
s6-rc: info: service init-customizations successfully started
s6-rc: info: service init-nodered: starting
s6-rc: info: service init-nginx successfully started

up to date, audited 1 package in 1s

found 0 vulnerabilities
s6-rc: info: service init-nodered successfully started
s6-rc: info: service nodered: starting
s6-rc: info: service nodered successfully started
s6-rc: info: service nginx: starting
s6-rc: info: service nginx successfully started
s6-rc: info: service legacy-services: starting
[19:28:43] INFO: Starting Node-RED...
s6-rc: info: service legacy-services successfully started

> start
> node $NODE_OPTIONS node_modules/node-red/red.js --settings /etc/node-red/config.js

15 Jan 19:28:51 - [info] 

Welcome to Node-RED
===================

15 Jan 19:28:51 - [info] Node-RED version: v3.1.3
15 Jan 19:28:51 - [info] Node.js  version: v18.19.0
15 Jan 19:28:51 - [info] Linux 3.10.108 x64 LE
15 Jan 19:28:52 - [info] Loading palette nodes
15 Jan 19:28:55 - [info] Node-RED Contrib Theme Collection version: v3.1.8
15 Jan 19:29:04 - [info] Dashboard version 3.6.2 started at /endpoint/ui
15 Jan 19:29:09 - [info] Settings file  : /etc/node-red/config.js
15 Jan 19:29:09 - [info] Context store  : 'default' [module=memory]
15 Jan 19:29:09 - [info] User directory : /config/
15 Jan 19:29:09 - [warn] Projects disabled : editorTheme.projects.enabled=false
15 Jan 19:29:09 - [info] Flows file     : /config/flows.json
15 Jan 19:29:09 - [warn] 
mmarcines commented 5 months ago

@drftg by saying patched you mean APK add command for c-ares? I've tried that five times during the day and all attemps ended at

s6-rc: info: service base-addon-banner: starting

And no further logs (and highe CPU).

After restarting container c-ares was reverting back to originał release causing curl command to hang.

How are you restarting container so c-ares is not reverting to initial release?

drftg commented 5 months ago

@mmarcines: restart using the docker gui provided by synology. Do not do it via HA as that initiates a new instance. I think that is why you are seeing a revert to the initial version of c-ares

mmarcines commented 5 months ago

I've rebooted container with reboot command (executed from inside container) and the other option used by me was Docker interface in DSM (restarting as shown on screenshot). Both causing c-ares being revertet to initial version Screenshot_20240115-195349

drftg commented 5 months ago

That is exactly how I did it. My Synology: DS916+ DSM 7.1.1-42962 Update 6

mmarcines commented 5 months ago

Weird. Have the same hardware and software....

mmarcines commented 5 months ago

Today I've upgraded HomeASsistant Supervisor to 2023.12.1 and nodered to 17.0.3 - and it looks like the issue no longer exists. Nodered addon is booting properly and after its uptime is already 15 minutes and CPU usage is only 0,5% ;-)

drftg commented 5 months ago

I did a "normal" update to 17.0.4 and everything seems to work without hiccups