Closed trwyant closed 6 months ago
POSIX::strftime() behavior change in Perl 5.39.8.
I have observed a behavior change in
POSIX::strftime()
in Perl 5.39.8. This is most succinctly demonstrated by this code:say strftime '%Y-%m-%d %H:%M:%S', gmtime $epoch;
If the
$epoch
represents a time when the local zone is shifted to Summer time, the output will be an hour later than the actual time. That is, at 2024-05-01 23:30:00 the output will be2024-05-02 00:30:00
.Perls before 5.39.8 do not exhibit this behavior.
Using this program:
$ cat gh-22062-gmtime.pl
#!/usr/bin/env perl
use 5.14.0;
use warnings;
use Time::Local ( qw| timegm_posix | );
use POSIX ( qw| strftime | );
my $string = strftime '%Y-%m-%d %H:%M:%S',
gmtime timegm_posix( 0, 30, 23, 14, 7-1, 2023-1900 );
say $string;
my ($day_of_month) = $string =~ m/^\d{4}-\d{1,2}-(\d{1,2})/;
die "ANOMALY" unless $day_of_month == 14;
... bisection with this invocation:
perl Porting/bisect.pl \
--start=v5.39.7 \
--end=v5.39.8 \
-- ./perl -Ilib /tmp/gh-22062-gmtime.pl
... pointed to 36e41d53ff
36e41d53ff6a5995c16234f8027ef34b5e6a34cc is the first bad commit
commit 36e41d53ff6a5995c16234f8027ef34b5e6a34cc
Author: Karl Williamson <khw@cpan.org>
Date: Wed Jan 10 11:30:11 2024 -0700
Commit: Karl Williamson <khw@cpan.org>
CommitDate: Wed Feb 7 10:59:05 2024 -0700
Don't use both mini_mktime() and mkttime()
@khwilliamson, can you take a look? Thanks.
It appears this commit inadvertently fixed a bug that the examples rely on.
The call of strftime
is documented as:
strftime(fmt, sec, min, hour, mday, mon, year, wday = -1, yday = -1, isdst = -1)
isdst
is one of 0
1
or negative, meaning respectively
So, if the isdst
parameter is omitted, it automatically takes option 3, and calculates the proper value based on its knowledge of the locale.
gmtime
returns in list context, the same parameters as strftime
accepts, and in the same order. But it fills in all of them, including the isdst
one.
perlfunc says
Note: When called in list context, $isdst, the last value returned by gmtime, is always 0. There is no Daylight Saving Time in GMT.
So isdst
always gets set to 0, so strftime
when called directly on the output of gmtime
assumes there is no daylight savings for any input.
Prior to the blamed commit, strftime
ignored its final parameter, so it always calculated if daylight savings is in effect
We could go back to that behavior, but that is contrary to the documentation, and there would be no way to tell strftime
to not do the calculation
I see. But omitting the isdst
indicator simply restricts the behavior change to one hour of the year. During that hour there appears to be no way to correctly format a UT.
$ perl-5.38.2/perl strftime-try 0 0 2 10 2 124
2024-03-10 02:00:00
$ perl-5.39.8/perl strftime-try 0 0 2 10 2 124
2024-03-10 03:00:00
The above was run in Olson zone America/New_York, with a corrected version of strftime-try.
I note that (so far) Time::Piece does not have this problem, but I'm not sure (yet) how that helps me. Will its strftime()
acquire the same behavior as POSIX::strftime()
?
I'm having trouble understanding this. In New York there was no 2am March 10, 2024. Using your program , I get for the instant before:
$ blead strftime-try 59 59 1 10 2 124
2024-03-10 01:59:59
And, then for the next instant, replicating your results:
$ blead strftime-try 0 0 2 10 2 124
2024-03-10 03:00:00
The hour between 2am and 2:59:59am is "lost", to be regained the next November in the United States.
Southhampton Island, in northern Canada is in the same timezone but does not observe daylight savings. Their timezone abbreviation is EST. $ TZ=EST blead strftime-try 0 0 2 10 2 124 2024-03-10 02:00:00
Same for the entire country of Peru; their timezone is PET $ TZ=PET blead strftime-try 0 0 2 10 2 124 2024-03-10 02:00:00
If you want UT, specify that as the time zone $ TZ=UT blead strftime-try 0 0 2 10 2 124 2024-03-10 02:00:00
So it appears to me that the new behavior is accurate. What am I missing?
My goal is to get Time-Piece to use POSIX::strftime(), for several reasons, chief is it doesn't work well under threads and has bugs with POSIX locales
I suspect that we have encountered a smoke-test failure that is due to the problem being discussed in this ticket.
See this log, http://perl.develop-help.com/dblog/5051317, which is the full log from this smoke-test run: https://perl5.test-smoke.org/report/5051326. There are 2 instances of 2 test failures each in cpan/podlators/t/devise-date.t
.
# Failed test 'devise_date matches strftime'
# at t/man/devise-date.t line 31.
# got: '2024-03-10'
# expected: '2024-03-11'
# Failed test 'devise_date ignores invalid SOURCE_DATE_EPOCH'
# at t/man/devise-date.t line 61.
# got: '2024-03-10'
# expected: '2024-03-11'
# Looks like you failed 2 tests of 6.
Here is the first instance of relevant code in cpan/podlators/t/devise-date.t
:
30 my $parser = Pod::Man->new;
31 is(
32 $parser->devise_date,
33 strftime('%Y-%m-%d', gmtime()),
34 'devise_date matches strftime',
35 );
Karl: Maybe I'm missing something. Does POSIX::strftime()
support formatting a GMT time? Or only local time? If GMT is supported, what is the supported way to do it? Am I supposed to do something like
{
local $ENV{TZ} = 'GMT'; # Or whatever
tzset();
say strftime( '%Y-%m-%d %H:%M:%S Zulu', gmtime;
}
tzset();
It is GMT I am trying to format, and no hours disappear in GMT regardless of what local time does.
I'm rather confused by this discussion, perhaps I am missing some common implementation detail of strftime but why would strftime modify the time components it was given based on any timezone setting? It should already receive all of the information it needs, that's the purpose of the separated list output of localtime and gmtime.
I've been seeing the podlators failure, and was having trouble reproducing it. I suspect now it will only happen between 10am and 11am (I'm on UTC+1100 summer time.)
The POSIX 2017 Standard says for strftime
Local timezone information is used as though strftime() called tzset().
The libc strftime
pays attention to the timezone. Perl doesn't control that. I suspect that the previous behavior was unintended, but we could go back to it if that becomes the consensus. Somehow it fed strftime
a different input so that the timezone was offset, but there were no comments that that was happening, and no obvious bit of code that did it, which is why I think it was inadvertent
FWIW, I have added strftime-try.c
to my original strftime repository, along with a Makefile
to build strftime-try-c
from it. When I run this on both the Mac and the Raspberry Pi, I get
$ ./strftime-try-c 0 30 2 10 2 124
2024-03-10 02:30:00
$ ./strftime-try-c 0 30 2 10 2 124 0 0 1
2024-03-10 02:30:00
$ ./strftime-try-c 0 30 2 10 2 124 0 0 0
2024-03-10 02:30:00
which is what I was expecting Perl to do, and which it did do before Perl 5.39.8.
Also FWIW, IEEE 1003.1-2017's description of strftime()
says in part:
Each conversion specifier is replaced by appropriate characters as described in the following list. The appropriate characters are determined using the LC_TIME category of the current locale and by the values of zero or more members of the broken-down time structure pointed to by timeptr, as specified in brackets in the description.
The brackets for conversion specifiers for date and time specify only the corresponding fields in the struct tm
structure, not tm_isdst
.
Now, section 7.2 "POSIX Locale" defers to the ISO C standard. I have not looked into this yet, since it's getting late in America/New_York, and anyway ISO tends to want you to drop a wad of Euros on them to see their standards.
I am just one Perl hobbyist, and it may well be that 5.39.8 fixes a long-standing bug. But having %H generate 3
when tm_hour
is 2
at least violates my personal version of the Principle of Least Surprise. If the 5.39.8 behavior stands, I would like to request, respectfully but fervently, that perl5400delta
call attention to this change and explain the reason that it was made.
Now, section 7.2 "POSIX Locale" defers to the ISO C standard. I have not looked into this yet, since it's getting late in America/New_York, and anyway ISO tends to want you to drop a wad of Euros on them to see their standards.
You can find links to the final working drafts of the C standards at https://en.cppreference.com/w/c/links , which should be good enough for most purposes.
I've been seeing the podlators failure, and was having trouble reproducing it. I suspect now it will only happen between 10am and 11am (I'm on UTC+1100 summer time.)
Well, we now have 16 instances of the failure of devise-date.t
in the core distribution smoke tests. See: https://perl5.test-smoke.org/submatrix?test=../cpan/podlators/t/man/devise-date.t&pversion=5.39.9.
And, not surprisingly, we are getting failures when this distribution is tested via CPAN testers. See: http://fast-matrix.cpantesters.org/?dist=podlators;perl=5.39.9;reports=1. Hence, this is now a BBC situation.
I've been seeing the podlators failure, and was having trouble reproducing it. I suspect now it will only happen between 10am and 11am (I'm on UTC+1100 summer time.)
I set my system time to America/Chicago and ran the test between 10 and 11 am CDT. I did not get a test failure.
$ date; !2011
date; cd t;./perl harness -v ../cpan/podlators/t/man/devise-date.t; cd -
Wed Mar 13 10:17:21 AM CDT 2024
ok 1 - devise_date matches strftime
ok 2 - devise_date honors POD_MAN_DATE
ok 3 - devise_date honors empty POD_MAN_DATE
ok 4 - devise_date honors SOURCE_DATE_EPOCH
ok 5 - devise_date honors POD_MAN_DATE over SOURCE_DATE_EPOCH
ok 6 - devise_date ignores invalid SOURCE_DATE_EPOCH
ok
All tests successful.
Files=1, Tests=6, 0 wallclock secs ( 0.00 usr 0.00 sys + 0.03 cusr 0.01 csys = 0.04 CPU)
Result: PASS
What am I missing?
I set my system time to America/Chicago and ran the test between 10 and 11 am CDT. I did not get a test failure. ... What am I missing?
I live in a UTC+1000 (+1100 in summer) timezone, so between those local times problem described here switches the values generated for strftime() from one day to another between 10am and 11am.
Chicago is UTC-0600 (-0500 in summer) going by google, so I think you'd need to test this between 6pm and 7pm Chicago time.
Rather than brain out the time ../cpan/podlators/t/man/devise-date.t
would fail when run from the t/
directory of the Perl 5.39.8 distro, I just tried every hour of the day. The relevant portion of the code is
my $time = time;
for ( 0 .. 23 ) {
my ( $sec, $min, $hour, $day, $mon, $year ) = localtime $time;
say sprintf '%04d-%02d-%02d %02d:%02d:%02d', $year + 1900, $mon + 1, $day, $hour, $min, $sec;
system '../perl', "-MTest::Time=time,$time", qw{ ../cpan/podlators/t/man/devise-date.t };
$time += 3600;
}
In America/New_York tests 1 and 6 failed between 8 and 9 PM. I was unable to get failures when going through harness
, I assume because it sanitizes the environment.
Correction to the above: I should have used Test::MockTime, but my brain wouldn't come up with it when I needed it. The problem with Test::Time
is that it does not touch gmtime()
, rendering my test failures themselves problematic. The relevant code, run from the top-level directory this time, is
my $time = time;
for ( 0 .. 23 ) {
my ( $sec, $min, $hour, $day, $mon, $year ) = localtime $time;
system './perl', '-MTest::MockTime=:all', '-e', "set_absolute_time($time); do './cpan/podlators/t/man/devise-date.t';";
$time += 3600;
}
And just now I happened to run the test suite during the 1-hour failure window, and got the test failure noted above. Running just that file:
$ date; cd t; ./perl harness ../cpan/podlators/t/man/devise-date.t; cd -
Sun Mar 17 07:05:49 PM EDT 2024
# Failed test 'devise_date matches strftime'
# at t/man/devise-date.t line 31.
# got: '2024-03-17'
# expected: '2024-03-18'
../cpan/podlators/t/man/devise-date.t .. 1/6
# Failed test 'devise_date ignores invalid SOURCE_DATE_EPOCH'
# at t/man/devise-date.t line 61.
# got: '2024-03-17'
# expected: '2024-03-18'
# Looks like you failed 2 tests of 6.
../cpan/podlators/t/man/devise-date.t .. Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/6 subtests
Test Summary Report
-------------------
../cpan/podlators/t/man/devise-date.t (Wstat: 512 (exited 2) Tests: 6 Failed: 2)
Failed tests: 1, 6
Non-zero exit status: 2
Files=1, Tests=6, 0 wallclock secs ( 0.01 usr 0.00 sys + 0.03 cusr 0.00 csys = 0.04 CPU)
Result: FAIL
Finished test run at Sun Mar 17 19:05:49 2024.
Yup, this is a release blocker.
The problem here is the mktime
call. mktime
only operates on local times. It "normalizes" the fields in the mytm
struct, which for the problematic cases here includes adjusting the time to match the local DST, changing tm_isdst
from 0 to 1 and changing the hour.
This is actually the exact reason the mini_mktime
function exists. This was added to fix #838 : 33c0e3ec3c9b4103fd0146ace1e9048f01c2013d.
I don't think it's appropriate to force mktime
on strftime
. I think 36e41d53ff6a5995c16234f8027ef34b5e6a34cc should be reverted.
Still there in 5.39.9.
Why is it appropriate for Perl's strftime()
to return a different result than C's strftime()
, given the same input?
I agree with @haarg that the problem is introduced by letting mktime
canonicalize rather than only getting two fields from it. It's that canonicalization that I think applies the local daylight saving time rules and results in shifting the date by an hour even though the $isdst
flag is false.
I believe the current behavior effectively means that gmtime
output is not a valid input to strftime
in the sense that it will produce surprising results that are in neither the local time zone nor in UTC. That feels like it would be a surprising change that would be likely to break more than just the Pod::Man test suite.
It's unfortunate that Perl started canonicalizing the input to strftime
at all. C avoids this problem by not doing any canonicalization; whatever you pass in via the struct tm
is what's used in the output. See, for example:
#include <stdio.h>
#include <time.h>
int
main(void)
{
time_t now;
struct tm *t;
char buf[BUFSIZ];
now = time(NULL);
t = gmtime(&now);
t->tm_mday = 56;
strftime(buf, sizeof(buf), "%Y-%m-%d", t);
printf("%s\n", buf);
return 0;
}
which at the time of this writing prints out 2024-03-56
. But we of course document that we do canonicalize so that can't be changed now:
The given arguments are made consistent as though by calling mktime() before calling your system's strftime() function, except that the "isdst" value is not affected.
I think that last sentence is the critical part, and would argue that the behavior in blead doesn't follow that promise (it applies a daylight saving rule even though $isdst
is 0). But whatever behavior we choose, it could use some more specific documentation.
I changed the Pod::Man test suite to stop calling strftime
on gmtime
, but after sleeping on this for a bit and reviewing this bug, I think I've convinced myself that the test caught a real bug (albeit not one in Pod::Man) and I should keep it as it previously was.
On Sun, Mar 17, 2024 at 11:26:28PM -0700, Graham Knop wrote:
The problem here is the
mktime
call.mktime
only operates on local times. It "normalizes" the fields in themytm
struct, which for the problematic cases here includes adjusting the time to match the local DST, changingtm_isdst
from 0 to 1 and changing the hour.This is actually the exact reason the
mini_mktime
function exists. This was added to fix #838 : 33c0e3ec3c9b4103fd0146ace1e9048f01c2013d.
Indeed. To quote the commit message from 1999:
normalize time for strftime() (without the isdst effects of
mktime()) using a custom mini_mktime()
Also, consider the code comments above the newly-added function:
/*
* mini_mktime - normalise struct tm values without the localtime()
* semantics (and overhead) of mktime().
*/
static void
mini_mktime(struct tm *ptm)
So it seems that it was very intentional that mini_mktime() would have different DST semantics than mktime(), and that this was specifically to give strftime() the "correct" semantics.
So making strftime use mktime() rather than mini_mktime() looks (on the face of it) to be wrong, and should be reverted.
In any case, we need to decide what to do sometime soon, as cpan/podlators/t/man/devise-date.t failures are kind of blowing up our smokers, and we definitely can't release 5.40.0 in this state.
-- "Emacs isn't a bad OS once you get used to it. It just lacks a decent editor."
https://github.com/Perl/perl5/pull/22119 reverts the blamed commit, clarifies the documentation, and makes unpublic an XS-callable function introduced earlier in the 5.39 series, hence has never appeared in a stable release. I'm doing that because I do view it as a bug that you can't choose whether to get daylight savings time versus not getting it. This will continue to be the case from Perl code, but there ought to be an API for XS code enabling the choice. I think it is too late in the development cycle to have a discussion about this and come to a reasonable conclusion, so this commit just defers the need for that. Note that the other new API function is retained, sv_strftime_tm()
. It takes a filled out tm
struct and doesn't do any normalization on it, so an XS writer can get what @rra suggested.
I find it interesting that there have been a lot of issues and poor decisions and poor communication about strftime()
, starting with its defective specification. You can get a NULL return for legal input, but it is also the returned value for illegal input, and no errno is set (except on Windows) to indicate the difference. Perl has employed heuristics to tease apart the difference (these failed for some locales until earlier in 5.39), but newer POSIX specifications make those heuristics more problematic in legal but crazy inputs.
Part of the PR message @haarg referred to (thanks for doing the archaeology on this, BTW) says
Thus, perl MUST NOT call mktime() in strftime() because strftime MUST ONLY PRINT its data and MUST NOT have any opinion of its content because it cannot know real time zone of the data.
This is wrong, because strftime()
does know the real time zone of its data. Also, it is surprising to me that the commit is fromspider-perl
, and the commentary makes it sound like it was written specifically for this commit or purpose, yet the internal comments of the added function indicate it was written by "lwall". Perhaps it was repurposed from somewhere else.
The commit message also says
One cannot do without this normalisation if the optionality of the three trailing parameters is to be preserved.
This also not true. All three of those parameters are ignored. They don't need to exist, and it is confusing that they do. My new commit cleans up the text about those. It removes the documentation of what the proper values of two of them should be. There are no improper values, since they are ignored.
Plain POSIX mktime()
also ignores those two parameters, and it turns out that the third, isdst
is also effectively ignored. Closer reading of the Standard adds the word initially
to its discussion of what the values for it mean. This means it is only a hint to the libc function, and you can't turn off finding out if daylight savings time is in effect.
Since we have long turned that off, mini_mktime()
is required.
POSIX::strftime() behavior change in Perl 5.39.8.
I have observed a behavior change in
POSIX::strftime()
in Perl 5.39.8. This is most succinctly demonstrated by this code:If the
$epoch
represents a time when the local zone is shifted to Summer time, the output will be an hour later than the actual time. That is, at 2024-05-01 23:30:00 the output will be2024-05-02 00:30:00
.Perls before 5.39.8 do not exhibit this behavior.
The behavior seems to be related to the value of element
[8]
of the returned array, theisdst
element. If this is0
(as it is by definition in the case ofgmtime()
), the behavior under Perl 5.39.8 is as described in the previous paragraph. If this element is absent or-1
, the output is an hour later only if the results ofgmtime()
specify a time which is during the jump to Summer time.These tests were done under Olson zone
America/New_York
, and with theTZ
environment variable undefined. I note that the core Perl testext/POSIX/t/time.t
jumps through all sorts of hoops to try to be portable among time zones. My read of this says that the steps relevant to formattinggmtime
output areDoing these does not affect the behavior of the code under any combination of OS and Perl I have tried. But setting
$ENV{TZ}
to any one ofGMT
,UTC
, orUTC0
does cause Perl 5.39.8 to behave like previous Perls, even in the absence of thetzset()
.I have read through the
perldelta.pod
for Perl 5.39.8 and not found this called out as a deliberate change. Adiff -u
between the 5.39.7 and 5.39.8 versions ofPOSIX.xs
finds two changes, but it is not obvious to me that either is relevant.In addition to this document, this Git repository contains two Perl scripts:
strftime-year
This script takes a year on the command line (defaulting to 2024), and formats the date and time for every 30 minutes throughout the year using both
strftime()
andsprintf()
on the output ofgmtime()
. If it detects intervals when the output of the two are not the same, it reports the first time where the difference occurs, and the first time where there ceases to be a difference. You can get more information usingstrftime-year --help
.strftime-try
This script formats its command-line arguments using
strftime()
and prints the results. The arguments are used verbatim. You can get more information usingstrftime-try --help
.Relevant data on the macOS Perl 5.39.8 build: