liske / needrestart

Restart daemons after library updates.
GNU General Public License v2.0

Always detecting ElasticSearch in LXC as service to restart #54

Closed: ktosiek closed 7 years ago

ktosiek commented 7 years ago

I've just restarted the container, but I'm still getting:

$ sudo needrestart -r l -v
[main] eval /etc/needrestart/needrestart.conf
[main] running in root-mode
[Core] Using UI 'NeedRestart::UI::stdio'...
[main] detected systemd
[Core] #8496 is a NeedRestart::Interp::Java
[Core] #8496 uses obsolete script file(s):
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-icu/icu4j-4.8.1.1.jar
[Core] #8496  /usr/share/elasticsearch/lib/asm-4.1.jar
[Core] #8496  /usr/share/elasticsearch/lib/elasticsearch-1.7.6.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/icedtea-sound.jar
[Core] #8496  /usr/share/elasticsearch/lib/groovy-all-2.4.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-memory-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/spatial4j-0.4.1.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-sandbox-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-stempel/elasticsearch-analysis-stempel-2.7.0.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-stempel/lucene-analyzers-stempel-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/log4j-1.2.17.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-spatial-4.10.4.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/jsse.jar
[Core] #8496  /usr/share/elasticsearch/lib/jna-4.1.0.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/sunjce_provider.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-misc-4.10.4.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/localedata.jar
[Core] #8496  /usr/share/elasticsearch/lib/apache-log4j-extras-1.2.17.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-join-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-analyzers-common-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-icu/lucene-icu-3.6.1.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-queries-4.10.4.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/sunpkcs11.jar
[Core] #8496  /usr/share/elasticsearch/lib/asm-commons-4.1.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-suggest-4.10.4.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/zipfs.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-queryparser-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/jts-1.13.jar
[Core] #8496  /usr/share/elasticsearch/lib/sigar/sigar-1.6.4.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-icu/elasticsearch-analysis-icu-1.7.0.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rt.jar
[Core] #8496  /usr/share/elasticsearch/lib/antlr-runtime-3.5.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-highlighter-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-core-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-expressions-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-grouping-4.10.4.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/dnsns.jar
[LXC] #8496 is part of LXC container 'elasticsearch' and should be restarted
[Kernel] Linux: kernel release 4.4.0-53-generic, kernel version #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016
[Kernel/Linux] /boot/vmlinuz-4.4.0-53-generic => 4.4.0-53-generic (buildd@lcy01-28) #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016 [4.4.0-53-generic]*
[Kernel/Linux] /boot/vmlinuz-4.4.0-51-generic => 4.4.0-51-generic (buildd@lcy01-08) #72-Ubuntu SMP Thu Nov 24 18:29:54 UTC 2016 [4.4.0-51-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-47-generic => 4.4.0-47-generic (buildd@lcy01-03) #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 [4.4.0-47-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-45-generic => 4.4.0-45-generic (buildd@lgw01-34) #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 [4.4.0-45-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-43-generic => 4.4.0-43-generic (buildd@lgw01-22) #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 [4.4.0-43-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-42-generic => 4.4.0-42-generic (buildd@lgw01-13) #62-Ubuntu SMP Fri Oct 7 23:11:45 UTC 2016 [4.4.0-42-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-38-generic => 4.4.0-38-generic (buildd@lgw01-58) #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016 [4.4.0-38-generic]
[Kernel/Linux] Expected linux version: 4.4.0-53-generic
Running kernel seems to be up-to-date.
No services need to be restarted.
Containers to be restarted:
 lxc-stop --reboot --name elasticsearch
No user sessions are running outdated binaries.

ktosiek commented 7 years ago

Also checked with current master, it's still a problem:

$ sudo ./needrestart -r l -v
[main] eval /etc/needrestart/needrestart.conf
[main] needrestart v2.9
[main] running in root mode
[Core] Using UI 'NeedRestart::UI::stdio'...
[main] detected systemd
[Core] #8496 is a NeedRestart::Interp::Java
[Core] #8496 uses obsolete script file(s):
[Core] #8496  /usr/share/elasticsearch/lib/log4j-1.2.17.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-highlighter-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-analyzers-common-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/sigar/sigar-1.6.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-queries-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/antlr-runtime-3.5.jar
[Core] #8496  /usr/share/elasticsearch/lib/elasticsearch-1.7.6.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-memory-4.10.4.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/jsse.jar
[Core] #8496  /usr/share/elasticsearch/lib/jna-4.1.0.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-icu/icu4j-4.8.1.1.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-icu/elasticsearch-analysis-icu-1.7.0.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/zipfs.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/sunjce_provider.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-misc-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-stempel/lucene-analyzers-stempel-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-sandbox-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-core-4.10.4.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/localedata.jar
[Core] #8496  /usr/share/elasticsearch/lib/jts-1.13.jar
[Core] #8496  /usr/share/elasticsearch/lib/spatial4j-0.4.1.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rt.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-queryparser-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/asm-commons-4.1.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-suggest-4.10.4.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/sunpkcs11.jar
[Core] #8496  /usr/share/elasticsearch/lib/asm-4.1.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-join-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-icu/lucene-icu-3.6.1.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-expressions-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-spatial-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/apache-log4j-extras-1.2.17.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/icedtea-sound.jar
[Core] #8496  /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/dnsns.jar
[Core] #8496  /usr/share/elasticsearch/lib/lucene-grouping-4.10.4.jar
[Core] #8496  /usr/share/elasticsearch/lib/groovy-all-2.4.4.jar
[Core] #8496  /usr/share/elasticsearch/plugins/analysis-stempel/elasticsearch-analysis-stempel-2.7.0.jar
[LXC] #8496 is part of LXC container 'elasticsearch' and should be restarted
[Core] #11374 is a NeedRestart::Interp::Python
[Python] #11374: could not get current working directory, skipping
[Kernel] Linux: kernel release 4.4.0-53-generic, kernel version #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016
[Kernel/Linux] /boot/vmlinuz-4.4.0-53-generic => 4.4.0-53-generic (buildd@lcy01-28) #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016 [4.4.0-53-generic]*
[Kernel/Linux] /boot/vmlinuz-4.4.0-51-generic => 4.4.0-51-generic (buildd@lcy01-08) #72-Ubuntu SMP Thu Nov 24 18:29:54 UTC 2016 [4.4.0-51-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-47-generic => 4.4.0-47-generic (buildd@lcy01-03) #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 [4.4.0-47-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-45-generic => 4.4.0-45-generic (buildd@lgw01-34) #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 [4.4.0-45-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-43-generic => 4.4.0-43-generic (buildd@lgw01-22) #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 [4.4.0-43-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-42-generic => 4.4.0-42-generic (buildd@lgw01-13) #62-Ubuntu SMP Fri Oct 7 23:11:45 UTC 2016 [4.4.0-42-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-38-generic => 4.4.0-38-generic (buildd@lgw01-58) #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016 [4.4.0-38-generic]
[Kernel/Linux] Expected linux version: 4.4.0-53-generic
Running kernel seems to be up-to-date.
No services need to be restarted.
Containers to be restarted:
 lxc-stop --reboot --name elasticsearch
No user sessions are running outdated binaries.

liske commented 7 years ago

Which filesystem is used inside the container (is it a stacked one)? The interpreter code compares the start time of process 8496 with the modification times of its mapped jar files and, for some reason, concludes that the jar files were modified after process 8496 was launched.
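
For readers following along, the check described above can be sketched roughly as follows. This is a minimal Python illustration, not needrestart's actual Perl code: the function name `files_modified_after` and the choice to skip unreadable files are assumptions made for the example.

```python
import os
import tempfile

def files_modified_after(start_epoch, paths):
    """Return the paths whose modification time is later than start_epoch."""
    newer = []
    for path in paths:
        try:
            mtime = os.stat(path).st_mtime
        except OSError:
            # Assumption for this sketch: files that cannot be read are skipped.
            continue
        if mtime > start_epoch:
            newer.append(path)
    return newer

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        jar = os.path.join(tmp, "example.jar")  # hypothetical file name
        open(jar, "w").close()
        os.utime(jar, (1000, 1000))  # mtime before the pretend process start
        print(files_modified_after(2000, [jar]))
        os.utime(jar, (3000, 3000))  # simulate a `touch` after the start
        print(files_modified_after(2000, [jar]))
```

Both values are plain epoch seconds, so a file is only flagged when it really changed after the process was launched.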

ktosiek commented 7 years ago

I think it's just a bind mount of the host's ext4.

Stat from inside the container:

# stat /usr/share/elasticsearch/lib/jna-4.1.0.jar
  File: ‘/usr/share/elasticsearch/lib/jna-4.1.0.jar’
  Size: 914597      Blocks: 1792       IO Block: 4096   regular file
Device: fc00h/64512d    Inode: 3539649     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2016-12-12 17:57:56.192643281 +0000
Modify: 2016-11-18 15:22:41.000000000 +0000
Change: 2016-11-25 13:49:44.101236875 +0000
 Birth: -

Stat from the outside:

# stat /var/lib/lxc/elasticsearch/rootfs/usr/share/elasticsearch/lib/jna-4.1.0.jar
  File: '/var/lib/lxc/elasticsearch/rootfs/usr/share/elasticsearch/lib/jna-4.1.0.jar'
  Size: 914597      Blocks: 1792       IO Block: 4096   regular file
Device: fc00h/64512d    Inode: 3539649     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2016-12-12 18:57:56.192643281 +0100
Modify: 2016-11-18 16:22:41.000000000 +0100
Change: 2016-11-25 14:49:44.101236875 +0100
 Birth: -

ktosiek commented 7 years ago

Oh my, I have different timezones inside and outside the container. But the last modification was a long time ago, and I restarted the container today.

liske commented 7 years ago

Could you please provide /proc/$PID/stat from inside and outside of the container (for the same java process)? The timezone should not matter, since epoch timestamps are used when comparing.
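
As a quick illustration of why the timezone difference is harmless, the two `Modify:` values from the stat output earlier in this thread can be converted to epoch seconds. This is a small Python sketch: the helper name `to_epoch` is made up for the example, and the fractional seconds are left out for simplicity.

```python
from datetime import datetime

def to_epoch(stat_time):
    """Convert a stat-style timestamp like '2016-11-18 15:22:41 +0000' to epoch seconds."""
    return int(datetime.strptime(stat_time, "%Y-%m-%d %H:%M:%S %z").timestamp())

inside = to_epoch("2016-11-18 15:22:41 +0000")   # Modify time seen inside the container
outside = to_epoch("2016-11-18 16:22:41 +0100")  # Modify time seen on the host
print(inside == outside)  # True
```

The UTC offset is part of each timestamp, so both strings describe the same instant.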

ktosiek commented 7 years ago

Outside:

$ cat /proc/8496/stat
8496 (java) S 6725 8483 8483 0 -1 1077936128 48547 0 2277 0 11217 5734 0 0 20 0 62 0 6925709 3619454976 51075 18446744073709551615 1 1 0 0 0 0 0 2 16800973 0 0 0 17 1 0 0 26 0 0 0 0 0 0 0 0 0 0

Outside, as root:

$ sudo cat /proc/8496/stat
8496 (java) S 6725 8483 8483 0 -1 1077936128 48559 0 2277 0 11240 5742 0 0 20 0 62 0 6925709 3619454976 51075 18446744073709551615 4194304 4196724 140729051799696 140729051782368 139928006182491 0 0 2 16800973 0 0 0 17 1 0 0 26 0 0 6294960 6295616 7467008 140729051802550 140729051803385 140729051803385 140729051803596 0

Inside:

# cat /proc/1645/stat
1645 (java) S 1 1633 1633 0 -1 1077936128 47889 0 2229 0 11117 5663 0 0 20 0 62 0 6925709 3619454976 50324 18446744073709551615 4194304 4196724 140729051799696 140729051782368 139928006182491 0 0 2 16800973 0 0 0 17 1 0 0 26 0 0 6294960 6295616 7467008 140729051802550 140729051803385 140729051803385 140729051803596 0
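
A side note for anyone comparing these lines by hand: per proc(5), field 22 of /proc/<pid>/stat is the process start time in clock ticks since boot. The Python sketch below pulls that field out of the two lines above (shortened here to the first fields); the helper name `starttime_ticks` is an assumption for the example.

```python
def starttime_ticks(stat_line):
    """Return field 22 (starttime, in clock ticks since boot) of a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split on the last ')' before tokenising the rest.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 22 sits at index 19.
    return int(after_comm[19])

# Shortened copies of the lines posted above (trailing fields omitted).
outside = "8496 (java) S 6725 8483 8483 0 -1 1077936128 48547 0 2277 0 11217 5734 0 0 20 0 62 0 6925709 3619454976 51075"
inside = "1645 (java) S 1 1633 1633 0 -1 1077936128 47889 0 2229 0 11117 5663 0 0 20 0 62 0 6925709 3619454976 50324"
print(starttime_ticks(outside), starttime_ticks(inside))
```

Both views report the same start tick, even though the PID differs between host and container.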

liske commented 7 years ago

Hi,

I'm still unable to reproduce this issue. Did you try to run needrestart inside your container? The problem does not seem to be triggered by the /proc/$PID/stat entries. Could you please provide the output of this small Perl one-liner from inside and outside of the container?

$ PID=8496 && perl -MProc::ProcessTable -MData::Dumper -e "\$Data::Dumper::Sortkeys++; \$pt = new Proc::ProcessTable(enable_ttys => 1); print Dumper(grep {\$_->pid == $PID;} @{\$pt->table});"

(The PID assignment needs to be adapted to match the correct process ID inside/outside)

ktosiek commented 7 years ago

Outside:

$VAR1 = bless( {
                 'cmajflt' => '0',
                 'cminflt' => '0',
                 'cmndline' => '/usr/lib/jvm/java-7-openjdk-amd64//bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-1.7.6.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/* -Des.default.config=/etc/elasticsearch/elasticsearch.yml -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch',
                 'cstime' => '0',
                 'ctime' => '0',
                 'cutime' => '0',
                 'cwd' => '/',
                 'egid' => 106,
                 'euid' => 103,
                 'exec' => undef,
                 'fgid' => 106,
                 'flags' => '1077936128',
                 'fname' => 'java',
                 'fuid' => 103,
                 'gid' => 106,
                 'majflt' => '91',
                 'minflt' => '22686',
                 'pctcpu' => ' 11.49',
                 'pctmem' => '2.57',
                 'pgrp' => 1767,
                 'pid' => 1803,
                 'ppid' => 32409,
                 'priority' => 20,
                 'rss' => '242675712',
                 'sess' => 1767,
                 'sgid' => 106,
                 'size' => '3617210368',
                 'start' => '1484494544',
                 'state' => 'sleep',
                 'stime' => '750000',
                 'suid' => 103,
                 'time' => '13670000',
                 'ttydev' => '',
                 'ttynum' => 0,
                 'uid' => 103,
                 'utime' => '12920000',
                 'wchan' => '0'
               }, 'Proc::ProcessTable::Process' );

Inside (I've had to install libproc-processtable-perl for it to run):

$VAR1 = bless( {
                 'cmajflt' => '0',
                 'cminflt' => '0',
                 'cmndline' => '/usr/lib/jvm/java-7-openjdk-amd64//bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-1.7.6.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/* -Des.default.config=/etc/elasticsearch/elasticsearch.yml -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch',
                 'cstime' => '0',
                 'ctime' => '0',
                 'cutime' => '0',
                 'cwd' => '/',
                 'egid' => 106,
                 'euid' => 103,
                 'exec' => '/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java',
                 'fgid' => 106,
                 'flags' => '1077936128',
                 'fname' => 'java',
                 'fuid' => 103,
                 'gid' => 106,
                 'majflt' => '91',
                 'minflt' => '23167',
                 'pctcpu' => '  5.41',
                 'pctmem' => '2.59',
                 'pgrp' => 1630,
                 'pid' => 1664,
                 'ppid' => 1,
                 'priority' => 20,
                 'rss' => '244273152',
                 'sess' => 1630,
                 'sgid' => 106,
                 'size' => '3617210368',
                 'start' => '1484494544',
                 'state' => 'sleep',
                 'stime' => '980000',
                 'suid' => 103,
                 'time' => '15410000',
                 'ttydev' => '',
                 'ttynum' => 0,
                 'uid' => 103,
                 'utime' => '14430000',
                 'wchan' => '0'
               }, 'Proc::ProcessTable::Process' );

liske commented 7 years ago

Thanks for your continuous feedback. I thought it was a bug somewhere in the interpretation of the process creation time and the Java class file timestamps. I was wrong: the bug was in how needrestart accessed interpreter files. It ignored /proc/$PID/root completely, so if the Elasticsearch class files were not present outside the container, needrestart would always suggest restarting the container because of Elasticsearch.
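
To make the idea concrete, here is a rough Python sketch of looking up a file through the process's own root instead of the host's. It is not the actual needrestart fix (that lives in the Perl modules), and the helper names are invented for the example. Reading /proc/<pid>/root of another user's process usually requires root.

```python
import os

def path_in_process_root(pid, path):
    """Map an absolute path as seen by a process to its location under /proc/<pid>/root."""
    return "/proc/%d/root/%s" % (pid, path.lstrip("/"))

def mtime_as_seen_by_process(pid, path):
    """Return the mtime of `path` inside the process's root, or None if it cannot be read."""
    try:
        return os.stat(path_in_process_root(pid, path)).st_mtime
    except OSError:
        return None

# PID and jar path taken from the logs above.
print(path_in_process_root(8496, "/usr/share/elasticsearch/lib/jna-4.1.0.jar"))
```

For a containerised process this resolves to the jar inside the container's filesystem, even when no file exists at the same path on the host.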

ktosiek commented 7 years ago

I'm happy to provide as much feedback as needed, I'm the one with a problem ;-)

I've just restarted the problematic container (just to be sure), did a git pull, and ./needrestart -r l -v 2>&1 | less still shows the elasticsearch container as needing restart because of "obsolete script files".

liske commented 7 years ago

If you pull git HEAD and run it from there, only the new needrestart script is used. The interpreter code lives in dedicated Perl modules, and those will still be loaded from /usr/share/perl5/NeedRestart/. I've attached a rebuilt Debian package, needrestart_2.11-0pre1_all.deb.zip; maybe you could give it a try? (Sorry, GitHub requires me to attach it as a zip file.)

ktosiek commented 7 years ago

Thank you, it works! I've just checked with the 0pre1 package, and it did not report the elasticsearch container as needing a restart until I touched one of the jars ^_^

geor-g commented 7 years ago

@liske I'm running into this same bug in multiple KVM VMs. I'm currently running Debian jessie; any chance of getting this fix into jessie-backports soon? That would be really great!

geor-g commented 7 years ago

@liske Hm... I'm already running 2.11-2~bpo8+1, so this fix should be included there, right? Should I open another ticket with more details?

liske commented 7 years ago

Please open another issue and include verbose logs if the problem still exists.