turing-machines / BMC-Firmware

Turing-pi BMC firmware
GNU General Public License v2.0

bmcd logging to console causes strange behavior if controlling terminal vanished #185

Open j0ju opened 4 months ago

j0ju commented 4 months ago

**Describe the bug**
If the controlling TTY of the bmcd process vanishes, bmcd behaves strangely on subsequent actions.

**To Reproduce**
Steps to reproduce the behavior:

  1. Log into the bmc via ssh
  2. Stop bmcd via ssh
  3. Start bmcd via ssh
  4. Log out of the ssh session and ensure it is closed (no ControlMaster or similar feature is in use)
  5. Use `tpi advanced msd -n 2`

NOTE: The node number does not matter. I started the process here by hand to demonstrate. The call to start-stop-daemon in /etc/init.d/S94bmcd backgrounds bmcd but keeps it attached to the local console.
    
    tpi: /media/images/rk1 # /usr/bin/bmcd --config /etc/bmcd/config.yaml & pid=$!
    2024-02-29T23:26:01.430Z INFO  [bmcd] Turing Pi 2 BMC Daemon v2.0.5

    tpi: /media/images/rk1 # tpi advanced -n 2 msd
    2024-02-29T23:26:23.406Z INFO [bmcd::app::bmc_application] Powering off node Node2...
    2024-02-29T23:26:24.012Z INFO [bmcd::app::bmc_application] Prerequisite settings toggled, powering on...
    2024-02-29T23:26:25.500Z INFO [bmcd::app::bmc_application] Checking for presence of a USB device...
    2024-02-29T23:26:25.512Z INFO [bmcd::firmware_update::rockusb_fwudate] Maskrom mode detected. loading usb-plug..
    ok

    tpi: /media/images/rk1 # echo $pid
    24004
    tpi: /media/images/rk1 # ls /proc/$pid/fd/[012] -l
    lrwx------ 1 root root 64 Feb 29 23:28 /proc/24004/fd/0 -> /dev/pts/3
    lrwx------ 1 root root 64 Feb 29 23:28 /proc/24004/fd/1 -> /dev/pts/3
    lrwx------ 1 root root 64 Feb 29 23:28 /proc/24004/fd/2 -> /dev/pts/3

    [ terminal closed ]
    [ fresh terminal ]
    tpi: /media/images/rk1 # pid=24004
    tpi: /media/images/rk1 # echo $pid
    24004
    tpi: /media/images/rk1 # ls /proc/$pid/fd/[012] -l
    lrwx------ 1 root root 64 Feb 29 23:30 /proc/24004/fd/0 -> '/dev/pts/3 (deleted)'
    lrwx------ 1 root root 64 Feb 29 23:30 /proc/24004/fd/1 -> '/dev/pts/3 (deleted)'
    lrwx------ 1 root root 64 Feb 29 23:30 /proc/24004/fd/2 -> '/dev/pts/3 (deleted)'

    tpi: /media/images/rk1 # tpi advanced -n 2 msd
    error sending request for url (https://127.0.0.1/api/bmc?opt=set&type=node_to_msd&node=1): connection closed before message completed

    tipi > ~ > tpi advanced -n 2 normal
    ok

    tpi: /media/images/rk1 # tpi advanced -n 2 msd
    error sending request for url (https://127.0.0.1/api/bmc?opt=set&type=node_to_msd&node=1): connection closed before message completed


**Expected behavior**
`tpi advanced msd -n 2` should work regardless of whether the terminal that bmcd was started from still exists.

**Versions**
linux version= Linux tpi 5.4.61 #4 SMP PREEMPT Sun Jan 28 13:47:02 UTC 2024 armv7l GNU/Linux
bmc version= 2024-02-28T23:49:15.568Z INFO [bmcd] Turing Pi 2 BMC Daemon v2.0.5

**Additional context**
It seems to be related to the file descriptors for STDOUT/STDERR becoming invalid when the pseudo terminal of the SSH session vanishes.
If bmcd produces output, e.g. the state messages regarding the USB devices for msd mode, it aborts the action without notice.
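
To check whether a running bmcd is in this state, the `ls /proc/$pid/fd/[012]` inspection from the transcript can be wrapped in a small helper. This is only a sketch: `classify_fd_target` and `check_stdio` are hypothetical names, not part of bmcd or tpi, and it assumes a Linux `/proc` and a `readlink` binary on the BMC.

```shell
#!/bin/sh
# Classify a /proc/<pid>/fd/<n> link target.
# Prints "vanished" if the kernel marks the target as deleted, "ok" otherwise.
classify_fd_target() {
    case "$1" in
        *" (deleted)") echo vanished ;;
        *)             echo ok ;;
    esac
}

# Report the state of stdin/stdout/stderr (fd 0, 1, 2) for a given PID.
# Hypothetical helper for illustration only.
check_stdio() {
    pid=$1
    for fd in 0 1 2; do
        if target=$(readlink "/proc/$pid/fd/$fd" 2>/dev/null); then
            echo "fd $fd -> $target: $(classify_fd_target "$target")"
        else
            echo "fd $fd: unreadable"
        fi
    done
}
```

Calling `check_stdio 24004` for the PID used above should report all three descriptors as vanished once the SSH session is gone.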

It would be nice if bmcd logged to syslog, so that the messages can then be forwarded to a syslog server, promtail, loki, ...
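
Until bmcd can log to syslog itself, a stopgap might be to pipe its output through `logger`. The sketch below is untested on the BMC and assumes a `logger` binary that accepts `-t` and `-p` is installed there; `run_logged` and `LOG_SINK` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Command that receives the combined stdout/stderr stream.
# Defaults to syslog via logger; override LOG_SINK to send it elsewhere.
LOG_SINK=${LOG_SINK:-"logger -t bmcd -p daemon.info"}

# Run a command with stdin detached from the terminal and with
# stdout and stderr piped into the log sink instead of a TTY.
run_logged() {
    "$@" < /dev/null 2>&1 | $LOG_SINK
}
```

For example, `run_logged /usr/bin/bmcd --config /etc/bmcd/config.yaml &` would start the daemon with no stdio descriptor pointing at the SSH pseudo terminal, so closing the session should no longer invalidate them.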

Locally I fixed this by adding redirections to /dev/null and /dev/console:

    start-stop-daemon --start --quiet --background --make-pidfile --pidfile "$PIDFILE" --no-close \
        --exec "$BIN" -- --config "/etc/bmcd/config.yaml" > /dev/console < /dev/null 2>&1

This circumvents the issue. 

When does this happen? Every time bmcd dies on an error, or is killed by the OOM reaper, and you restart it manually.