Closed MaintainUplift closed 1 year ago
[snip] However, since we are compiling from source and we are using the latest perl 5.34.0 directly from CPAN, AND we are still getting this bug, I have tried a number of things to fix it, and they have failed. It seems that there's a difference between the Time::HiRes packaged with perl 5.34.0 (1.9764) and the newest version on CPAN (1.9767), yet still this apparently does not address this Time::HiRes issue when compiling from src on RHEL 8.5.
Point of clarification: I think you may have the above versions backward.
1.9764 is the most recent CPAN release (Aug 10 2020):
1.9767 was bundled with perl-5.34.0:
$ cd perl-5.34.0
$ ack 'our \$VERSION' dist/Time-HiRes/HiRes.pm
our $VERSION = '1.9767';
Also, note that any changes we would make to address your concerns would be initiated in Perl 5 blead, where we are currently up to 1.9769.
$ grep -n -E 'our \$VERSION' dist/Time-HiRes/HiRes.pm
53:our $VERSION = '1.9769';
@MaintainUplift, the problematic file, dist/Time-HiRes/t/stat.t
, currently lacks descriptions (a.k.a. labels) for its unit tests, which IMO impedes debugging. I have created pull request https://github.com/Perl/perl5/pull/19322 to provide those descriptions.
Would you be able to configure and build perl
on the architecture, and with the RHEL patches, you are currently using, and paste the results?
Thank you very much. Jim Keenan
Here's what happened in the Configure and Build stages:
[edited 2022-02-10 hvds to collapse long output]
Here's what happened in the Configure and Build stages:
I should have been more specific. All we really need at this point is the output from:
$ uname -mrs
$ ./perl -Ilib -V:config_args
$ ./perl -Ilib -Idist/Time-HiRes/ dist/Time-HiRes/t/stat.t
... which can be issued from the top-level directory of your git checkout.
Here are the results:
[brass@brass-eastus2-poc-app-vmss-01000004 perl-5.34.0]$ uname -mrs
Linux 4.18.0-348.7.1.el8_5.x86_64 x86_64
[brass@brass-eastus2-poc-app-vmss-01000004 perl-5.34.0]$ ./perl -Ilib -V:config_args
config_args='-des -Dprefix=/opt/app/brass/brassperl';
brass@brass-eastus2-poc-app-vmss-01000004 perl-5.34.0]$ ./perl -Ilib -Idist/Time-HiRes/ dist/Time-HiRes/t/stat.t
1..43
# I am the main process 354288, starting the watchdog process...
# The watchdog process 354289 launched, continuing testing...
# I am the watchdog process 354289, sleeping for 360 seconds...
ok 1
ok 2
ok 3
ok 4
ok 5
not ok 6
# Failed test at dist/Time-HiRes/t/stat.t line 35.
# Structures begin differing at:
# $got->[8] = '1641302120.89724'
# $expected->[8] = '1641302120.89624'
ok 7
ok 8
ok 9
ok 10
ok 11
ok 12
not ok 13
# Failed test at dist/Time-HiRes/t/stat.t line 35.
# Structures begin differing at:
# $got->[8] = '1641302121.16524'
# $expected->[8] = '1641302120.89724'
ok 14
ok 15
ok 16
ok 17
ok 18
ok 19
not ok 20
# Failed test at dist/Time-HiRes/t/stat.t line 35.
# Structures begin differing at:
# $got->[8] = '1641302121.50125'
# $expected->[8] = '1641302121.35724'
ok 21
ok 22
ok 23
ok 24
ok 25
ok 26
not ok 27
# Failed test at dist/Time-HiRes/t/stat.t line 35.
# Structures begin differing at:
# $got->[8] = '1641302121.72425'
# $expected->[8] = '1641302121.61625'
ok 28
ok 29
ok 30
ok 31
ok 32
ok 33
not ok 34
# Failed test at dist/Time-HiRes/t/stat.t line 35.
# Structures begin differing at:
# $got->[8] = '1641302122.03725'
# $expected->[8] = '1641302121.85825'
ok 35
# mtime = 1641302120.89624 1641302121.16524 1641302121.50125 1641302121.72425 1641302122.03625
# atime = 1641302120.89724 1641302121.35724 1641302121.61625 1641302121.85825 1641302122.03725
# ai = 4, mi = 4, ss = 8
ok 36
ok 37
ok 38
ok 39
ok 40
ok 41
ok 42
ok 43
# I am the main process 354288, terminating the watchdog process 354289 before it terminates me in 358 seconds (testing took 2 seconds).
# kill KILL 354289 = 1
# All done.
# Looks like you failed 5 tests of 43.
Thanks extremely much for your kind assistance.
[Edited 2022-02-10 hvds to fix formatting]
@MaintainUplift, two updates:
First, to facilitate debugging we have just updated dist/Time-HiRes/t/stat.t
and associated files in Perl 5 blead to provide labels for all tests. So it would be helpful if, in your git checkout, you could (i) git pull origin blead
; (ii) apply your regular platform patches, rerun Configure
and make test_prep
; (iii) then re-run the failing test.
Second, I've been trying to wrap my head around what stat.t
is actually doing. I think this is an accurate description:
The program goes through 5 cycles, in each of which it:
... further tests not relevant to this issue.
P5P: Is this an accurate description of what dist/Time-HiRes/t/stat.t
is doing?
However, when I start to add debugging code to stat.t
, I start to notice some unusual(?) things. (See https://github.com/jkeenan/perl5/tree/time-hires-t-stat-20220103-preliminary-debug.)
git diff -w blead..HEAD |cat
diff --git a/dist/Time-HiRes/t/stat.t b/dist/Time-HiRes/t/stat.t
index 1f1fa963d8..d274940f0e 100644
--- a/dist/Time-HiRes/t/stat.t
+++ b/dist/Time-HiRes/t/stat.t
@@ -25,6 +25,7 @@ for (1..5) {
open(X, '>', $$);
print X $$;
close(X);
+ note "Created file $$";
my($a, $stat, $b) = ("a", [Time::HiRes::stat($$)], "b");
is $a, "a", "stat stack discipline";
is $b, "b", "stat stack discipline";
@@ -38,6 +39,8 @@ for (1..5) {
skip "testing stat access time on Haiku", 2;
}
is_deeply $lstat, $stat, "write: stat and lstat returned same values";
+ note("stat: atime after write: $stat->[8]");
+ note("lstat: atime after write: $lstat->[8]");
Time::HiRes::sleep(rand(0.1) + 0.1);
open(X, '<', $$);
<X>;
@@ -47,6 +50,7 @@ for (1..5) {
$lstat = [Time::HiRes::lstat($$)];
is_deeply $lstat, $stat, "read: stat and lstat returned same values";
}
+ #unlink $$;
}
1 while unlink $$;
note ("mtime = @mtime");
... and I now re-run the test program, I get:
$ ./perl -Ilib -Idist/Time-HiRes dist/Time-HiRes/t/stat.t
1..43
# I am the main process 61428, starting the watchdog process...
# The watchdog process 61437 launched, continuing testing...
# I am the watchdog process 61437, sleeping for 360 seconds...
# cycle 1
# Created file 61428
ok 1 - stat stack discipline
ok 2 - stat stack discipline
ok 3 - stat returned array
ok 4 - lstat stack discipline
ok 5 - lstat stack discipline
ok 6 - write: stat and lstat returned same values
# stat: atime after write: 1641315605.38869
# lstat: atime after write: 1641315605.38869
ok 7 - read: stat and lstat returned same values
# cycle 2
# Created file 61428
ok 8 - stat stack discipline
ok 9 - stat stack discipline
ok 10 - stat returned array
ok 11 - lstat stack discipline
ok 12 - lstat stack discipline
ok 13 - write: stat and lstat returned same values
# stat: atime after write: 1641315605.38869
# lstat: atime after write: 1641315605.38869
ok 14 - read: stat and lstat returned same values
# cycle 3
# Created file 61428
ok 15 - stat stack discipline
ok 16 - stat stack discipline
ok 17 - stat returned array
ok 18 - lstat stack discipline
ok 19 - lstat stack discipline
ok 20 - write: stat and lstat returned same values
# stat: atime after write: 1641315605.38869
# lstat: atime after write: 1641315605.38869
ok 21 - read: stat and lstat returned same values
# cycle 4
# Created file 61428
ok 22 - stat stack discipline
ok 23 - stat stack discipline
ok 24 - stat returned array
ok 25 - lstat stack discipline
ok 26 - lstat stack discipline
ok 27 - write: stat and lstat returned same values
# stat: atime after write: 1641315605.38869
# lstat: atime after write: 1641315605.38869
ok 28 - read: stat and lstat returned same values
# cycle 5
# Created file 61428
ok 29 - stat stack discipline
ok 30 - stat stack discipline
ok 31 - stat returned array
ok 32 - lstat stack discipline
ok 33 - lstat stack discipline
ok 34 - write: stat and lstat returned same values
# stat: atime after write: 1641315605.38869
# lstat: atime after write: 1641315605.38869
ok 35 - read: stat and lstat returned same values
...
Note that the atime
reported by the Time::HiRes::stat
and ::lstat
calls don't change from one cycle to the next. Is this because (i) the file we're writing to in each cycle bears the same filename (the value of the PID); and/or (ii) we're only unlink
-ing the temp file at the end of all cycles?
My investigation suggests that if I unlink
the tempfile at the end of each cycle, that atime
values reported advance chronologically as I would expect.
But I don't know whether this is related to the OP's problem or not.
Ok, as instructed, I pulled a blead copy to local winbox then zipped it and sent it to the server. One note, before I could start I had to run dos2unix
on the entire folder. Then, during the make
, this happened:
Use of uninitialized value $check in hash element at ./regen/embed_lib.pl line 118,
Ok, as instructed, I pulled a blead copy to local winbox then zipped it and sent it to the server. One note, before I could start I had to run
dos2unix
on the entire folder. Then, during themake
, this happened:Use of uninitialized value $check in hash element at ./regen/embed_lib.pl line 118, line 3. Use of uninitialized value $check in hash element at ./regen/embed_lib.pl line 118, line 8.
[snip]
@MaintainUplift, I have to admit that I think you're introducing complications into this process. Your Subject line and initial post clearly indicate you're trying to diagnose a problem you experience on Linux. Why than are you introducing a Windows box to the diagnostic process?
I am not actually "introducing" a windows box here. The ONLY way I can get the blead so that I can then load it to the linux system is via these means. I cannot pull it directly to the target system because it's firewalled all to heck (and git is not and will not be installed on it anyway). I did not touch any of the code, so I was QUITE surprised to find that I had to run dos2unix on it. All I did was pull it, zip, and sftp it to the box and unzip it. If there's some other Special Linux Only pull process, I am eager to learn it.
Many thanks for your continued assistance.
Update:
Yep, I'm ignorant. But ignorant NO LONGER - and here's a tip for everyone else:
I was not aware that using 7zip to zip a collection of texts would introduce DUMB THINGS like changing the line endings to something other than what it came with.
SO: DO NOT USE 7zip TO ZIP ANYTHING YOU ARE SENDING TO A NIX ENVIRONMENT!
By the way: Thanks to your extremely fine support efforts the issue has been resolved:
[brass@brass-eastus2-poc-app-vmss-01000004 perl-5.34.0]$ ./perl -Ilib -Idist/Time-HiRes/ dist/Time-HiRes/t/stat.t
1..43
# I am the main process 992333, starting the watchdog process...
# The watchdog process 992334 launched, continuing testing...
# I am the watchdog process 992334, sleeping for 360 seconds...
# cycle 1
# Created file 992333
ok 1 - stat stack discipline
ok 2 - stat stack discipline
ok 3 - stat returned array
ok 4 - lstat stack discipline
ok 5 - lstat stack discipline
ok 6 - write: stat and lstat returned same values
# stat: atime after write: 1641408569.45209
# lstat: atime after write: 1641408569.45209
ok 7 - read: stat and lstat returned same values
# cycle 2
# Created file 992333
ok 8 - stat stack discipline
ok 9 - stat stack discipline
ok 10 - stat returned array
ok 11 - lstat stack discipline
ok 12 - lstat stack discipline
ok 13 - write: stat and lstat returned same values
# stat: atime after write: 1641408569.75709
# lstat: atime after write: 1641408569.75709
ok 14 - read: stat and lstat returned same values
# cycle 3
# Created file 992333
ok 15 - stat stack discipline
ok 16 - stat stack discipline
ok 17 - stat returned array
ok 18 - lstat stack discipline
ok 19 - lstat stack discipline
ok 20 - write: stat and lstat returned same values
# stat: atime after write: 1641408570.0661
# lstat: atime after write: 1641408570.0661
ok 21 - read: stat and lstat returned same values
# cycle 4
# Created file 992333
ok 22 - stat stack discipline
ok 23 - stat stack discipline
ok 24 - stat returned array
ok 25 - lstat stack discipline
ok 26 - lstat stack discipline
ok 27 - write: stat and lstat returned same values
# stat: atime after write: 1641408570.4411
# lstat: atime after write: 1641408570.4411
ok 28 - read: stat and lstat returned same values
# cycle 5
# Created file 992333
ok 29 - stat stack discipline
ok 30 - stat stack discipline
ok 31 - stat returned array
ok 32 - lstat stack discipline
ok 33 - lstat stack discipline
ok 34 - write: stat and lstat returned same values
# stat: atime after write: 1641408570.77811
# lstat: atime after write: 1641408570.77811
ok 35 - read: stat and lstat returned same values
# mtime = 1641408569.45209 1641408569.75709 1641408570.0661 1641408570.4411 1641408570.77811
# atime = 1641408569.60209 1641408569.8861 1641408570.2581 1641408570.59311 1641408570.88611
# ai = 4, mi = 4, ss = 8
ok 36 - monotonical increase and subsecond results within expected parameters
# compare Time::HiRes::stat with ::lstat
ok 37 - stat on target
ok 38 - lstat on target
ok 39 - stat on link
ok 40 - lstat on link
ok 41 - stat and lstat return same values on target
ok 42 - stat and lstat return same values on link
ok 43 - target stat mode value differs from link lstat mode value
# I am the main process 992333, terminating the watchdog process 992334 before it terminates me in 359 seconds (testing took 1 seconds).
# kill KILL 992334 = 1
# All done.
Thanks once again.
[Edited 2022-02-10 hvds to fix formatting]
Hi Team,
We're also facing same, could some one help on this towards resolutions.
Hi Team,
We're also facing same, could some one help on this towards resolutions.
This ticket was closed once the original poster better understood the problem. If you are experiencing a problem, please open a new ticket per the guidelines specified under the green "New Issue" button at https://github.com/Perl/perl5/issues.
It doesn't seem like anything was done to actually fix this issue, but that running the test enough times eventually resulted in a clean run.
I was able to reproduce this issue when installing perl 5.34.1 on Ubuntu 18.04 using perlbrew. This was using an Azure Hyper-V VM. When running on a Hyper-V gen2 VM on Windows 10 I was not able to reproduce. I think this issue could be related to something Azure is doing with time and/or filesystem access. Filesystem is ext4 mounted with relatime in both situations.
Some debug output has been collected here: https://gist.github.com/robinsmidsrod/8649c2e2056ce0e4440ee7e01b6667e2
Note that the
atime
reported by theTime::HiRes::stat
and::lstat
calls don't change from one cycle to the next. Is this because (i) the file we're writing to in each cycle bears the same filename (the value of the PID); and/or (ii) we're onlyunlink
-ing the temp file at the end of all cycles?My investigation suggests that if I
unlink
the tempfile at the end of each cycle, thatatime
values reported advance chronologically as I would expect.
The filesystem is probably mounted as noatime
, which prevents access times from being updated. Creating the file anew makes the new files atime the current time.
I was able to reproduce this issue when installing perl 5.34.1 on Ubuntu 18.04 using perlbrew. This was using an Azure Hyper-V VM. When running on a Hyper-V gen2 VM on Windows 10 I was not able to reproduce. I think this issue could be related to something Azure is doing with time and/or filesystem access. Filesystem is ext4 mounted with relatime in both situations.
@robinsmidsrod It looks like something is accessing the file between the calls to stat() and lstat(), updating the file's atime.
I could see a virus scanner or backup tool doing this if it's monitoring the filesystem for changed, is that likely in your environment?
@tonycoz It's quite likely that a scanner of some sort can modify the atime, as we use Prisma Cloud Defender on the Azure hosts to protect against malware and other intrusions.
Hello. I am working with a team on RHEL 8.5 and, to avoid package updates breaking our application, we are compiling our own perl 5.34.0 from source. We keep running into an issue with the testing of the Time::HiRes module:
I am aware that this has been reported as an issue with utime.t and usleep.t as Bug #122819 (https://rt.cpan.org/Public/Bug/Display.html?id=122819), and apparently Bug #116127 (https://rt.cpan.org/Public/Bug/Display.html?id=116127), but I have not seen it reported as a failure with stat.t.
I am also aware from running perl -V on our system perl that this has apparently been addressed in Fedora Patch58:
Fedora Patch58: Fix unreliable Time-HiRes tests (CPAN RT#122819)
However, since we are compiling from source and we are using the latest perl 5.34.0 directly from CPAN, AND we are still getting this bug, I have tried a number of things to fix it, and they have failed. It seems that there's a difference between the Time::HiRes packaged with perl 5.34.0 (1.9764) and the newest version on CPAN (1.9767), yet still this apparently does not address this Time::HiRes issue when compiling from src on RHEL 8.5.
Can someone kindly help me figure out how to get past this hurdle and patch the perl we are trying to install?
MANY THANKS!