liske / needrestart

Restart daemons after library updates.
GNU General Public License v2.0

needrestart hangs in apt hook turning into a zombie #241

Open Corsaire0177 opened 2 years ago

Corsaire0177 commented 2 years ago

The problem happens on Debian 10 (buster) with needrestart 3.4 and 3.5.

After running an apt upgrade, you get the following:

[...]
Setting up libirs161:amd64 (1:9.11.5.P4+dfsg-5.1+deb10u7) ...
Setting up bind9-host (1:9.11.5.P4+dfsg-5.1+deb10u7) ...
Setting up dnsutils (1:9.11.5.P4+dfsg-5.1+deb10u7) ...
Processing triggers for libc-bin (2.28-10+deb10u1) ...
Processing triggers for mime-support (3.62) ...
Scanning processes...
Scanning candidates...

Failed to check for processor microcode upgrades.

Restarting services...
 invoke-rc.d cron restart
 invoke-rc.d nagios-nrpe-server restart
 invoke-rc.d nullmailer restart
 invoke-rc.d open-vm-tools restart
 invoke-rc.d openntpd restart
 invoke-rc.d snmpd restart
 invoke-rc.d ssh restart
 invoke-rc.d syslog-ng restart
 invoke-rc.d ulogd2 restart
 invoke-rc.d unbound restart

No containers need to be restarted.

User sessions running outdated binaries:
 root @ /dev/pts/0: bash[10397]

Message from root@sv03-stage on (none) at 18:09 ...

Your session is running obsolete binaries or libraries as listed below.
Please consider a relogin or restart of the affected processes!

     1    bash[10397]

EOF
 root @ /dev/tty1: getty[2053]
 root @ /dev/tty2: getty[2054]
 root @ /dev/tty3: getty[2055]
 root @ /dev/tty4: getty[2056]
 root @ /dev/tty5: getty[2057]
 root @ /dev/tty6: getty[2058]

There it just sits. The only way out is Ctrl+C.

Looking at the process tree, needrestart has become a zombie. The faulty process:

  ├─sshd,10395
  │   └─bash,10397
  │       └─apt-get,10865 upgrade -y
  │           └─apt-get,12265 upgrade -y
  │               └─sh,12266 -c test -x /usr/lib/needrestart/apt-pinvoke && /usr/lib/needrestart/apt-pinvoke || true
  │                   └─frontend,12267 -w /usr/share/debconf/frontend /usr/sbin/needrestart
  │                       └─(needrestart,12277)

Killing the parent process "frontend" is also a way to get out of it.
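
The zombie state shown in the tree above can also be confirmed without pstree by reading the process state from /proc. This is only a minimal sketch, assuming a Linux /proc filesystem; the helper names `parse_stat_line` and `find_zombies` are hypothetical and not part of needrestart.

```python
import os


def parse_stat_line(line):
    """Parse one /proc/<pid>/stat line into (pid, comm, state, ppid).

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the LAST closing parenthesis.
    """
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    fields = tail.split()
    return int(pid_text), comm, fields[0], int(fields[1])


def find_zombies(proc_root="/proc"):
    """Return (pid, comm, ppid) for every process in state 'Z' (zombie)."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state, ppid = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable; skip it
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return zombies


if __name__ == "__main__":
    for pid, comm, ppid in find_zombies():
        print(f"zombie {comm}[{pid}] waiting to be reaped by parent {ppid}")
```

A zombie is only removed once its parent collects its exit status or the parent itself exits, which is consistent with killing "frontend" clearing the hang.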

Corsaire0177 commented 2 years ago

Oh great...

I found out that the same problem occurs on some Debian 9 hosts. I say "some" because not all servers hung.

# cat /etc/debian_version
9.13
# dpkg -l | grep needrestart
ii  needrestart 2.11-3+deb9u1                     all          check which daemons need to be restarted after library upgrades
  ├─sshd,2094
  │   ├─sshd,24499
  │   │   └─sh,24594 -c /usr/bin/python /root/.ansible/tmp/ansible-tmp-1652172426.49-4195-68636559789086/AnsiballZ_apt.py && sleep 0
  │   │       └─python,24595 /root/.ansible/tmp/ansible-tmp-1652172426.49-4195-68636559789086/AnsiballZ_apt.py
  │   │           └─aptitude,24910 -y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold safe-upgrade
  │   │               ├─aptitude,30264 -y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold safe-upgrade
  │   │               │   └─sh,30265 -c test -x /usr/lib/needrestart/apt-pinvoke && /usr/lib/needrestart/apt-pinvoke || true
  │   │               │       └─needrestart,30266 /usr/sbin/needrestart
  │   │               │           └─(10-dpkg,30299)
  │   │               └─{aptitude},24914

liske commented 1 year ago

Did you observe this behavior in any normal SSH sessions? Maybe aptitude under Ansible does not run non-interactively.
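
One way to compare a normal SSH session with the Ansible run is to print what the job itself sees. This is a rough sketch, not a needrestart feature: the helper name `session_report` and the `looks_noninteractive` heuristic are hypothetical. `DEBIAN_FRONTEND` is the environment variable debconf reads to select its frontend.

```python
import os
import sys


def session_report(environ, stdin_is_tty):
    """Summarise whether a package run looks interactive.

    The heuristic is an assumption: the run is treated as non-interactive
    only if debconf is told so AND stdin is not a terminal.
    """
    frontend = environ.get("DEBIAN_FRONTEND", "")
    return {
        "debian_frontend": frontend or "(unset)",
        "stdin_is_tty": stdin_is_tty,
        "looks_noninteractive": frontend == "noninteractive" and not stdin_is_tty,
    }


if __name__ == "__main__":
    print(session_report(os.environ, sys.stdin.isatty()))
```

Running it once from a login shell and once from the same context as the upgrade job shows whether the two environments differ.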