monitoringartist / dockbix-agent-xxl

:whale: Dockerized Zabbix agent with Docker metrics and host metrics support for CoreOS, RHEL, CentOS, Ubuntu, Debian, Fedora, Boot2docker, Photon OS, Amazon Linux, ...
https://hub.docker.com/r/monitoringartist/dockbix-agent-xxl-limited/

Open file handles prevent containers from being removed #28

Closed chibacityblues closed 6 years ago

chibacityblues commented 7 years ago

I'm using dockbix-agent-xxl-limited on one of our CI systems for evaluation. While it works nicely, I ran into an issue when containers are removed (either by hand or by CI):

Removing my_app_staging_db ... error

ERROR: for my_app_staging_db  Unable to remove filesystem for 4170b684a2bd193d6e5f73717d20fd4266afcf04ba0c5737d6649bc1e3365107: remove /var/lib/docker/containers/4170b684a2bd193d6e5f73717d20fd4266afcf04ba0c5737d6649bc1e3365107/shm: device or resource busy

Digging deeper revealed dockbix-agent-xxl as the blocking process:

$ grep -l 4170b684a2 /proc/*/mountinfo
/proc/14571/mountinfo
$ ps -f 14571
UID        PID  PPID  C STIME TTY      STAT   TIME CMD
root     14571 14556  0 Mai02 ?        Ssl    0:00 /dockbix-agent-xxl

After stopping the agent-container, removing other containers works again.
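
The `grep -l ... /proc/*/mountinfo` check above can also be scripted, for example to run it before a CI cleanup step. The following is only a minimal sketch, not part of dockbix-agent-xxl: the function name `find_mount_holders` and the `proc_root` parameter are hypothetical and exist purely for illustration. It assumes a Linux host and enough privileges to read other processes' `mountinfo` files.

```python
import os


def find_mount_holders(container_id, proc_root="/proc"):
    """Return the sorted PIDs whose mountinfo mentions container_id.

    container_id may be a full ID or a prefix, as with grep.
    proc_root is a hypothetical parameter so the scan can be pointed
    at a different directory (useful for testing).
    """
    holders = []
    for entry in os.listdir(proc_root):
        # Only numeric entries are processes; skip "self", "sys", etc.
        if not entry.isdigit():
            continue
        path = os.path.join(proc_root, entry, "mountinfo")
        try:
            with open(path, "r", errors="replace") as fh:
                if any(container_id in line for line in fh):
                    holders.append(int(entry))
        except OSError:
            # The process exited in the meantime, or we lack permission.
            continue
    return sorted(holders)
```

Calling `find_mount_holders("4170b684a2")` as root on the affected host should list the same PIDs that the `grep` command reports, which can then be inspected with `ps -f <pid>`.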

Is there something I can do about this?

jangaraj commented 7 years ago

Thanks for the report. There is nothing you can do about it on your side. Let me see what I can do in the code.

samkrew commented 7 years ago

Hello! You can't fix it. cAdvisor has the same issue with AUFS. Everyone should use OverlayFS to avoid this issue.

jangaraj commented 7 years ago

Yes, I'm following https://github.com/google/cadvisor/issues/771. It looks like a problem in the Docker golang library that is used. monitoringartist/dockbix-agent-xxl-limited uses that library to collect stats data. Unfortunately, the library doesn't provide a disconnect function. In theory, the paid image monitoringartist/dockbix-agent-xxl doesn't have this issue, because it doesn't initialize the Docker golang library.

OverlayFS can be a workaround, but you may run into other issues related to overlayfs: https://github.com/monitoringartist/dockbix-agent-xxl/issues/16

jangaraj commented 6 years ago

I've added signal forwarding (version 3.4-3 and later), so any SIGTERM signal from the Docker daemon will also be forwarded to zabbix_agentd. It may help with this issue.
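
For readers unfamiliar with the pattern, signal forwarding in a container entrypoint generally looks like the sketch below. This is not the code used in the image; it is a generic illustration in Python, and the function name `run_with_signal_forwarding` is hypothetical. The idea is that the wrapper process (PID 1 in the container) receives the signal from the Docker daemon and passes it on to the child it started, so the child can shut down and release its resources.

```python
import signal
import subprocess


def run_with_signal_forwarding(cmd, signals=(signal.SIGTERM, signal.SIGINT)):
    """Start cmd as a child process and relay the given signals to it.

    Returns the child's exit code (negative signal number if the child
    was killed by a signal). Must be called from the main thread,
    because Python only allows installing signal handlers there.
    """
    child = subprocess.Popen(cmd)

    def forward(signum, _frame):
        # Relay the signal only while the child is still running.
        if child.poll() is None:
            child.send_signal(signum)

    # Install the forwarding handler and remember the previous ones.
    previous = {sig: signal.signal(sig, forward) for sig in signals}
    try:
        return child.wait()
    finally:
        # Restore the original handlers once the child has exited.
        for sig, handler in previous.items():
            signal.signal(sig, handler)
```

In a container, `cmd` would be the agent daemon's command line; the wrapper then exits with the child's status after the signal has been delivered.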