Closed ktosiek closed 7 years ago
Also checked with current master, it's still a problem:
$ sudo ./needrestart -r l -v
[main] eval /etc/needrestart/needrestart.conf
[main] needrestart v2.9
[main] running in root mode
[Core] Using UI 'NeedRestart::UI::stdio'...
[main] detected systemd
[Core] #8496 is a NeedRestart::Interp::Java
[Core] #8496 uses obsolete script file(s):
[Core] #8496 /usr/share/elasticsearch/lib/log4j-1.2.17.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-highlighter-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-analyzers-common-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/sigar/sigar-1.6.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-queries-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/antlr-runtime-3.5.jar
[Core] #8496 /usr/share/elasticsearch/lib/elasticsearch-1.7.6.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-memory-4.10.4.jar
[Core] #8496 /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/jsse.jar
[Core] #8496 /usr/share/elasticsearch/lib/jna-4.1.0.jar
[Core] #8496 /usr/share/elasticsearch/plugins/analysis-icu/icu4j-4.8.1.1.jar
[Core] #8496 /usr/share/elasticsearch/plugins/analysis-icu/elasticsearch-analysis-icu-1.7.0.jar
[Core] #8496 /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/zipfs.jar
[Core] #8496 /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/sunjce_provider.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-misc-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/plugins/analysis-stempel/lucene-analyzers-stempel-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-sandbox-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-core-4.10.4.jar
[Core] #8496 /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/localedata.jar
[Core] #8496 /usr/share/elasticsearch/lib/jts-1.13.jar
[Core] #8496 /usr/share/elasticsearch/lib/spatial4j-0.4.1.jar
[Core] #8496 /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rt.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-queryparser-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/asm-commons-4.1.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-suggest-4.10.4.jar
[Core] #8496 /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/sunpkcs11.jar
[Core] #8496 /usr/share/elasticsearch/lib/asm-4.1.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-join-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/plugins/analysis-icu/lucene-icu-3.6.1.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-expressions-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-spatial-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/apache-log4j-extras-1.2.17.jar
[Core] #8496 /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/icedtea-sound.jar
[Core] #8496 /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/dnsns.jar
[Core] #8496 /usr/share/elasticsearch/lib/lucene-grouping-4.10.4.jar
[Core] #8496 /usr/share/elasticsearch/lib/groovy-all-2.4.4.jar
[Core] #8496 /usr/share/elasticsearch/plugins/analysis-stempel/elasticsearch-analysis-stempel-2.7.0.jar
[LXC] #8496 is part of LXC container 'elasticsearch' and should be restarted
[Core] #11374 is a NeedRestart::Interp::Python
[Python] #11374: could not get current working directory, skipping
[Kernel] Linux: kernel release 4.4.0-53-generic, kernel version #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016
[Kernel/Linux] /boot/vmlinuz-4.4.0-53-generic => 4.4.0-53-generic (buildd@lcy01-28) #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016 [4.4.0-53-generic]*
[Kernel/Linux] /boot/vmlinuz-4.4.0-51-generic => 4.4.0-51-generic (buildd@lcy01-08) #72-Ubuntu SMP Thu Nov 24 18:29:54 UTC 2016 [4.4.0-51-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-47-generic => 4.4.0-47-generic (buildd@lcy01-03) #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 [4.4.0-47-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-45-generic => 4.4.0-45-generic (buildd@lgw01-34) #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 [4.4.0-45-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-43-generic => 4.4.0-43-generic (buildd@lgw01-22) #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 [4.4.0-43-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-42-generic => 4.4.0-42-generic (buildd@lgw01-13) #62-Ubuntu SMP Fri Oct 7 23:11:45 UTC 2016 [4.4.0-42-generic]
[Kernel/Linux] /boot/vmlinuz-4.4.0-38-generic => 4.4.0-38-generic (buildd@lgw01-58) #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016 [4.4.0-38-generic]
[Kernel/Linux] Expected linux version: 4.4.0-53-generic
Running kernel seems to be up-to-date.
No services need to be restarted.
Containers to be restarted:
lxc-stop --reboot --name elasticsearch
No user sessions are running outdated binaries.
Which filesystem is used inside the container (a stacked one)? The interpreter check compares the start time of process 8496 with the modification time of the mapped jar files, and for some reason it concludes that the jar files were modified after process 8496 was launched.
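The comparison described here can be sketched in shell. This is only an illustration under assumptions, not needrestart's actual Perl code: `is_obsolete` and `proc_start_epoch` are hypothetical helper names, and the start-time calculation assumes a Linux `/proc`.

```shell
# Hypothetical helpers for illustration; needrestart itself is written in Perl.

# Succeeds if a file's mtime is later than the process start time.
# Both arguments are epoch seconds.
is_obsolete() {
    file_mtime=$1
    proc_start=$2
    [ "$file_mtime" -gt "$proc_start" ]
}

# Prints the start time of a process in epoch seconds (Linux /proc only).
proc_start_epoch() {
    pid=$1
    stat_line=$(cat "/proc/$pid/stat") || return 1
    # Drop "pid (comm) " up to the last ')' so a command name with spaces
    # cannot shift the fields; field 3 (state) then becomes $1.
    rest=${stat_line##*) }
    set -- $rest
    ticks=${20}   # field 22: start time in clock ticks since boot
    btime=$(awk '/^btime/ {print $2}' /proc/stat)   # boot time, epoch seconds
    hz=$(getconf CLK_TCK 2>/dev/null || echo 100)   # ticks per second
    echo $(( btime + ticks / hz ))
}
```

With GNU `stat`, a single jar could then be checked with something like `is_obsolete "$(stat -c %Y /path/to/file.jar)" "$(proc_start_epoch 8496)" && echo obsolete`.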
I think it's just a bind mount of the host's ext4.
Stat from inside the container:
# stat /usr/share/elasticsearch/lib/jna-4.1.0.jar
File: ‘/usr/share/elasticsearch/lib/jna-4.1.0.jar’
Size: 914597 Blocks: 1792 IO Block: 4096 regular file
Device: fc00h/64512d Inode: 3539649 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2016-12-12 17:57:56.192643281 +0000
Modify: 2016-11-18 15:22:41.000000000 +0000
Change: 2016-11-25 13:49:44.101236875 +0000
Birth: -
Stat from the outside:
# stat /var/lib/lxc/elasticsearch/rootfs/usr/share/elasticsearch/lib/jna-4.1.0.jar
File: '/var/lib/lxc/elasticsearch/rootfs/usr/share/elasticsearch/lib/jna-4.1.0.jar'
Size: 914597 Blocks: 1792 IO Block: 4096 regular file
Device: fc00h/64512d Inode: 3539649 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2016-12-12 18:57:56.192643281 +0100
Modify: 2016-11-18 16:22:41.000000000 +0100
Change: 2016-11-25 14:49:44.101236875 +0100
Birth: -
Oh my, I have different timezones inside and outside the container. But the last modification was a long time ago, and I've restarted the container today.
Could you please provide /proc/$PID/stat from inside and outside of the container (for the same java process)? The timezone should not matter, since epoch timestamps are used when comparing timestamps.
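To illustrate why the timezone difference should be harmless, the two "Modify:" values from the stat output earlier in this thread can be converted to epoch seconds. This is a small sketch that assumes GNU `date` (the `-d` option); `to_epoch` is a hypothetical helper name.

```shell
# Convert a timestamp with an explicit UTC offset to epoch seconds.
# Assumes GNU date; other date implementations use different options.
to_epoch() {
    date -d "$1" +%s
}

# "Modify:" as reported inside the container (+0000) and on the host (+0100).
inside=$(to_epoch '2016-11-18 15:22:41 +0000')
outside=$(to_epoch '2016-11-18 16:22:41 +0100')

if [ "$inside" -eq "$outside" ]; then
    echo "same instant: $inside"
else
    echo "different instants: $inside vs $outside"
fi
```

Both strings describe the same moment, so a comparison done on epoch values is unaffected by the differing local time zones.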
Outside:
$ cat /proc/8496/stat
8496 (java) S 6725 8483 8483 0 -1 1077936128 48547 0 2277 0 11217 5734 0 0 20 0 62 0 6925709 3619454976 51075 18446744073709551615 1 1 0 0 0 0 0 2 16800973 0 0 0 17 1 0 0 26 0 0 0 0 0 0 0 0 0 0
Outside, as root:
$ sudo cat /proc/8496/stat
8496 (java) S 6725 8483 8483 0 -1 1077936128 48559 0 2277 0 11240 5742 0 0 20 0 62 0 6925709 3619454976 51075 18446744073709551615 4194304 4196724 140729051799696 140729051782368 139928006182491 0 0 2 16800973 0 0 0 17 1 0 0 26 0 0 6294960 6295616 7467008 140729051802550 140729051803385 140729051803385 140729051803596 0
Inside:
# cat /proc/1645/stat
1645 (java) S 1 1633 1633 0 -1 1077936128 47889 0 2229 0 11117 5663 0 0 20 0 62 0 6925709 3619454976 50324 18446744073709551615 4194304 4196724 140729051799696 140729051782368 139928006182491 0 0 2 16800973 0 0 0 17 1 0 0 26 0 0 6294960 6295616 7467008 140729051802550 140729051803385 140729051803385 140729051803596 0
Hi,
I'm still unable to reproduce this issue. Did you try to run needrestart inside your container? The problem does not seem to be triggered by the /proc/$PID/stat entries. Could you please provide the output of this small Perl command-line call from inside and outside of the container?
$ PID=8496 && perl -MProc::ProcessTable -MData::Dumper -e "\$Data::Dumper::Sortkeys++; \$pt = new Proc::ProcessTable(enable_ttys => 1); print Dumper(grep {\$_->pid == $PID;} @{\$pt->table});"
(The PID assignment needs to be adapted to match the correct process ID inside/outside)
Outside:
$VAR1 = bless( {
'cmajflt' => '0',
'cminflt' => '0',
'cmndline' => '/usr/lib/jvm/java-7-openjdk-amd64//bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-1.7.6.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/* -Des.default.config=/etc/elasticsearch/elasticsearch.yml -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch',
'cstime' => '0',
'ctime' => '0',
'cutime' => '0',
'cwd' => '/',
'egid' => 106,
'euid' => 103,
'exec' => undef,
'fgid' => 106,
'flags' => '1077936128',
'fname' => 'java',
'fuid' => 103,
'gid' => 106,
'majflt' => '91',
'minflt' => '22686',
'pctcpu' => ' 11.49',
'pctmem' => '2.57',
'pgrp' => 1767,
'pid' => 1803,
'ppid' => 32409,
'priority' => 20,
'rss' => '242675712',
'sess' => 1767,
'sgid' => 106,
'size' => '3617210368',
'start' => '1484494544',
'state' => 'sleep',
'stime' => '750000',
'suid' => 103,
'time' => '13670000',
'ttydev' => '',
'ttynum' => 0,
'uid' => 103,
'utime' => '12920000',
'wchan' => '0'
}, 'Proc::ProcessTable::Process' );
Inside (I've had to install libproc-processtable-perl for it to run):
$VAR1 = bless( {
'cmajflt' => '0',
'cminflt' => '0',
'cmndline' => '/usr/lib/jvm/java-7-openjdk-amd64//bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-1.7.6.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/* -Des.default.config=/etc/elasticsearch/elasticsearch.yml -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch',
'cstime' => '0',
'ctime' => '0',
'cutime' => '0',
'cwd' => '/',
'egid' => 106,
'euid' => 103,
'exec' => '/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java',
'fgid' => 106,
'flags' => '1077936128',
'fname' => 'java',
'fuid' => 103,
'gid' => 106,
'majflt' => '91',
'minflt' => '23167',
'pctcpu' => ' 5.41',
'pctmem' => '2.59',
'pgrp' => 1630,
'pid' => 1664,
'ppid' => 1,
'priority' => 20,
'rss' => '244273152',
'sess' => 1630,
'sgid' => 106,
'size' => '3617210368',
'start' => '1484494544',
'state' => 'sleep',
'stime' => '980000',
'suid' => 103,
'time' => '15410000',
'ttydev' => '',
'ttynum' => 0,
'uid' => 103,
'utime' => '14430000',
'wchan' => '0'
}, 'Proc::ProcessTable::Process' );
Thanks for your continuous feedback. I thought it was a bug somewhere in the interpretation of the process creation time and the java class file timestamps. I was wrong: the bug was in how needrestart accessed interpreter files. It ignored /proc/$PID/root completely, so if the elasticsearch class files are not present outside the container, needrestart will always suggest restarting the container because of elasticsearch.
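As a rough sketch of the idea (not the actual patch), a path seen by a containerized process can be looked up from the host through that process's root directory under `/proc`. The helper name `path_via_proc_root` is hypothetical.

```shell
# Hypothetical helper: map a path as seen by process PID to a path that
# can be opened from the host, by going through /proc/PID/root.
path_via_proc_root() {
    pid=$1
    path=$2   # absolute path inside the process's root filesystem
    printf '%s\n' "/proc/$pid/root$path"
}

# Example (commented out; typically needs root to follow /proc/PID/root):
#   stat -c %Y "$(path_via_proc_root 8496 /usr/share/elasticsearch/lib/jna-4.1.0.jar)"
```

Looking the file up this way reads the copy the process actually uses, rather than whatever happens to exist (or not exist) at the same path on the host.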
I'm happy to provide as much feedback as needed, I'm the one with a problem ;-)
I've just restarted the problematic container (just to be sure) and did a git pull, and ./needrestart -r l -v 2>&1 | less still shows the elasticsearch container as needing restart because of "obsolete script files".
Only the new needrestart script will be used if you pull git HEAD and try to run it. The interpreter stuff lives in dedicated Perl modules, and they will still be loaded from /usr/share/perl5/NeedRestart/. I've attached a rebuilt Debian package, needrestart_2.11-0pre1_all.deb.zip; maybe you could give it a try (sorry, GitHub requires me to attach it as a zip file)?
Thank you, it works! I've just checked with the 0pre1 package, and it did not report the elasticsearch container as needing restart until I touched one of the jars ^_^
@liske I'm running into this same bug in multiple KVM VMs. I'm currently running Debian jessie; any chance of getting this fix into j-bp? Soon? This would be really, really great!
@liske Hm... I'm running 2.11-2~bpo8+1 already, so this fix should be included there, right? Should I open another ticket showing more details?
Please open another issue and include verbose logs if the problem still exists.
I've just restarted the container, but I'm still getting: