iicky closed this issue 4 years ago
Thanks for the bug report - I'll try to reproduce it with my Ubuntu NUC at home. In theory the processes should be hard killed by NodeJS, which is also what I observed when I tried it. I didn't try it with this specific setup though, so maybe we need some small adaptations in the Dockerfile.
@mKeRix Awesome, thanks! Let me know if you need any additional info.
I have the same issue. Docker logs
[Nest] 1 - 04/03/2020, 5:00:53 PM [BluetoothClassicService] Query of XX:XX:XX:XX:XX:XX took too long, resetting hci0
[Nest] 1 - 04/03/2020, 5:01:23 PM [BluetoothClassicService] Query of XX:XX:XX:XX:XX:XX took too long, resetting hci0
[Nest] 1 - 04/03/2020, 5:01:29 PM [BluetoothClassicService] Query of XX:XX:XX:XX:XX:XX took too long, resetting hci0
[Nest] 1 - 04/03/2020, 5:01:35 PM [BluetoothClassicService] Query of XX:XX:XX:XX:XX:XX took too long, resetting hci0
Zombie processes
root 3779 0.0 0.0 0 0 ? Z 19:00 0:00 [hcitool] <defunct>
root 3782 0.0 0.0 0 0 ? Z 19:00 0:00 [hcitool] <defunct>
root 3799 0.0 0.0 0 0 ? Z 19:01 0:00 [hcitool] <defunct>
root 3802 0.0 0.0 0 0 ? Z 19:01 0:00 [hcitool] <defunct>
root 3805 0.0 0.0 0 0 ? Z 19:01 0:00 [hcitool] <defunct>
Environment
room-assistant version: 2.2.0
installation type: Docker
hardware: ProxmoxVE
OS: Debian 4.19.98-1 (2020-01-26) x86_64 GNU/Linux
Config Docker
version: '3'
services:
  room-assistant:
    container_name: room-assistant
    image: mkerix/room-assistant
    restart: unless-stopped
    network_mode: host
    cap_add:
      - NET_ADMIN
    volumes:
      - /var/run/dbus:/var/run/dbus
      - /etc/room-assistant/config:/room-assistant/config
Room assistant
global:
  instanceName: Home
  integrations:
    - homeAssistant
    - bluetoothClassic
homeAssistant:
  mqttUrl: 'mqtt://XXX.XXX.XXX.XXX:1883'
  mqttOptions:
    username: <user>
    password: <pass>
bluetoothClassic:
  addresses:
    - 'XX:XX:XX:XX:XX:XX'
I just tested today using a USB Bluetooth dongle on the NUC and changing the device to hci1. The result was even more zombie processes, so I don't think it is the Bluetooth device.
Agreed. Also, the fact that we all use different dongles or built-in chips should in theory rule out a hardware issue.
I also think it's unlikely that this is related to hardware. My current idea is that hcitool on Alpine Linux (which is used for the Docker images) doesn't handle SIGKILL correctly (or NodeJS doesn't send the signal correctly). As Alpine Linux is rarely used outside of Docker, that would explain why we are only seeing the issue there.
I tried out a Debian image using the Dockerfile below and I am still getting the same zombie processes. I did my best to find matching or similar packages, so I'm not 100% sure the image is a complete Debian replacement, but I can confirm that the zombie processes still appear with Debian.
FROM node:12-slim as build
ARG ROOM_ASSISTANT_VERSION=latest
RUN apt-get update && apt-get upgrade -y
RUN apt-get install -y python g++ make libusb-dev avahi-utils libavahi-compat-libdnssd-dev
RUN npm install -g --unsafe-perm room-assistant@$ROOM_ASSISTANT_VERSION
FROM node:12-slim
WORKDIR /room-assistant
RUN apt-get update && apt-get install -y bluez libusb-dev avahi-utils dmidecode libavahi-compat-libdnssd1
RUN ln -s /usr/local/lib/node_modules/room-assistant/bin/room-assistant.js /usr/local/bin/room-assistant
COPY --from=build /usr/local/lib/node_modules/room-assistant /usr/local/lib/node_modules/room-assistant
ENTRYPOINT ["room-assistant"]
CMD ["--digResolver"]
Thanks for checking that already @iicky - saved me some work. I reproduced the issue on a Raspi 3 with Docker today and found a fix. Expect it to be released sometime later today.
The issue arose due to the way Docker manages processes, or rather that NodeJS wasn't made to be PID 1. There is some more information in this article.
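For readers who want to see what "not running NodeJS as PID 1" can look like in practice, here is a minimal sketch based on the compose file posted above. It is not necessarily how the fix in the release was implemented. It assumes Compose file format 3.7 or later, where the `init` option is available; with it, Docker starts a small init process as PID 1 that forwards signals and reaps exited children such as the defunct `hcitool` processes.

```yaml
# Sketch only: same service as in the report, with an init process added.
# Assumes Compose file format 3.7+ (the original file used version '3').
version: '3.7'
services:
  room-assistant:
    container_name: room-assistant
    image: mkerix/room-assistant
    restart: unless-stopped
    network_mode: host
    # Run Docker's bundled init as PID 1 so orphaned children get reaped.
    init: true
    cap_add:
      - NET_ADMIN
    volumes:
      - /var/run/dbus:/var/run/dbus
      - /etc/room-assistant/config:/room-assistant/config
```

When starting the container with `docker run` instead of Compose, the `--init` flag has the same effect.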
:tada: This issue has been resolved in version 2.4.0 :tada:
The release is available on:
Your semantic-release bot :package::rocket:
Perfect - I can confirm that there are no more zombie processes after the update. Thanks so much!
Same here, works like a charm, fantastic work! Thanks for the fix and the link to give us a bit more background.
Describe the bug
I am getting a lot of zombie processes popping up as a result of my room-assistant Docker container. It appears that whenever the BluetoothClassicService query takes too long and hci0 is reset, a zombie process is left behind. This results in tons of zombie processes over time until I restart the room-assistant container, at which point they are all destroyed. This issue is specifically impacting my deployment on my Intel NUC and is not occurring on my Raspberry Pi deploys.
To reproduce
Deploy using the docker-compose file and config below.
Relevant logs
Room Assistant logs.
Zombie processes.
Relevant configuration
Docker Compose
docker-compose.yaml
Room Assistant config
local.yml
Expected behavior
I expect the zombie processes to not appear.
Environment
Additional context
Zombie processes are not appearing on either of the Raspberry Pis that have room-assistant installed.