Open jasontibbitts opened 7 years ago
If you have the opportunity (since the month is about to roll over again), are you able to reproduce this with the following patch in place?
diff --git a/imap/imapd.c b/imap/imapd.c
index 3dfb5bb..d73bc1c 100644
--- a/imap/imapd.c
+++ b/imap/imapd.c
@@ -12313,8 +12313,11 @@ static int getdatetime(time_t *date)
buf[i] = '\0';
r = time_from_rfc3501(buf, date);
- if (r < 0)
+ if (r < 0) {
+ syslog(LOG_DEBUG, "%s: time_from_rfc3501 rejected date string: '%s'",
+ __func__, buf);
goto baddate;
+ }
c = prot_getc(imapd_in);
return c;
My hunch at the moment is that this is going to turn out to be some sort of bug in Cassandane, but it'd be very useful to see what's actually ending up in Cyrus
Though it might also be somehow related to this: https://github.com/cyrusimap/cyrus-imapd/issues/2044
Suspect something going awry here: https://github.com/cyrusimap/cassandane/blob/bcefcdb220b32f83bd468fce8fad7080e9d6ed53/Cassandane/Util/DateTime.pm#L190-L205
In the case of the test failure in earlier comments, I think we get in here with no arguments, so I guess localtime() will be returning the current time (I'm not sure why it doesn't whinge about $dt->epoch
though?)
If you get a window to reproduce this, can you please do so with this patch applied to Cassandane and -vvv
:
diff --git a/Cassandane/Generator.pm b/Cassandane/Generator.pm
index 7abcbfd..6144eac 100644
--- a/Cassandane/Generator.pm
+++ b/Cassandane/Generator.pm
@@ -49,6 +49,7 @@ use Cassandane::Util::DateTime qw(to_rfc822 from_iso8601);
use Cassandane::Address;
use Cassandane::Message;
use Cassandane::Util::SHA;
+use Cassandane::Util::Log;
our $admin = 'qa@cyrus.works';
@@ -281,6 +282,7 @@ sub generate
my ($self, @aparams) = @_;
my $params = $self->_params_defaults(@aparams);
my $datestr = to_rfc822($params->{date});
+ xlog "generating message with datestr '$datestr'";
my $from = $params->{from};
my $to = $params->{to};
my $extra_lines = $params->{extra_lines};
Scratch that last patch, here's a more helpful one (additional logging):
diff --git a/Cassandane/Generator.pm b/Cassandane/Generator.pm
index 7abcbfd..b352253 100644
--- a/Cassandane/Generator.pm
+++ b/Cassandane/Generator.pm
@@ -49,6 +49,7 @@ use Cassandane::Util::DateTime qw(to_rfc822 from_iso8601);
use Cassandane::Address;
use Cassandane::Message;
use Cassandane::Util::SHA;
+use Cassandane::Util::Log;
our $admin = 'qa@cyrus.works';
@@ -280,7 +281,9 @@ sub generate
{
my ($self, @aparams) = @_;
my $params = $self->_params_defaults(@aparams);
+ xlog "entered generate with date param '$params->{date}'";
my $datestr = to_rfc822($params->{date});
+ xlog "generating message with datestr '$datestr'";
my $from = $params->{from};
my $to = $params->{to};
my $extra_lines = $params->{extra_lines};
diff --git a/Cassandane/Message.pm b/Cassandane/Message.pm
index b4b7036..93b42d2 100644
--- a/Cassandane/Message.pm
+++ b/Cassandane/Message.pm
@@ -432,6 +432,7 @@ sub set_internaldate
{
$id = to_rfc3501($id);
}
+ xlog "setting internaldate to '$id'";
$self->set_attribute(internaldate => $id);
}
I think I have finally tripped this myself, on 2020-05-01 at 11:36am AEST (24 minutes before it became May in UTC)
Curiously, the only test that seemed to fail as a direct result was JMAPEmail.email_get. I wonder if this was already partially fixed at some point?
The failure in JMAPEmail.email_get is happening because, in Cassandane::Util::DateTime, the to_rfc822
and to_rfc3501
functions disagree about the date.
to_rfc822
is used to format the Date header, and the dates in the Received headers. to_rfc3501
is used to format the timestamp in the APPEND command
Here's what the append looks like during the window where the problem occurs:
C: 4 append inbox " 1-Apr-2020 09:52:20 +1000" {811}
S: + go ahead
[...]
Date: Fri, 01 May 2020 09:52:20 +1000
The same DateTime object is used to format both of these dates (in Cassandane::Generator), and is used again later (in the test) when checking that the message has the correct fields. But since the functions produced different dates in the message, one of those checks will definitely fail.
Once UTC time ticks over to the next month, the test starts to pass again, and the window for testing this closes. I don't know when the window opens, but I suspect it's probably "when the local date's month starts to differ from the UTC date's month".
I wonder if this is even more wonky if your timezone is behind UTC, and not ahead of it like mine -- maybe that explains @jasontibbitts getting many more errors in this window than I do?
At time of writing, this is the to_rfc822
implementation: https://github.com/cyrusimap/cassandane/blob/037d5bdd4663adf16448ec87d673bd684ecfe7f9/Cassandane/Util/DateTime.pm#L186-L203
And this is the to_rfc3501
implementation:
https://github.com/cyrusimap/cassandane/blob/037d5bdd4663adf16448ec87d673bd684ecfe7f9/Cassandane/Util/DateTime.pm#L247-L254
Some things that are worth observing, that may turn out to be meaningful:
to_rfc822
decomposes the DateTime object with localtime($dt->epoch)
before using its components, whereas to_rfc3501
uses the DateTime object's month directlyto_rfc3501
subtracts one from the DateTime object's month, which is presumably meant to compensate for 1-indexed vs 0-indexed months, but it smells bad. Note specifically that the date it produces during the bad window is exactly one calendar month earlier than it should beThe fix might be as simple as making to_rfc3501
decompose the object first in the same way that to_rfc822
does, but I can't verify that anymore today!
Archive.archive_messages
and Archive.archive_messages_archive_annotation
were also consistently failing during this window, though they didn't look related when I looked at them.
But they're passing again now with no changes, so maybe they are related, but in a less obvious way.
Yesterday evening, which for me was about 8PM CDT on May 31st, my normal cassandane run picked up 152 errors. All that I checked were of the form:
1) test_delete(Cassandane::Cyrus::ACL) Perl exception: IMAP Command : 'append' failed. Response was : bad - Invalid date-time in Append command
This morning, with no changes to my packaging, those have all gone away.
Now it's possible that one of the underlying dependencies changed in a buggy way and then was fixed overnight, but I have diffed the installed dependency list between the good and the bad builds and don't see any differences. It's also possible that Cassandane (or even the core Cyrus code, I guess) has some problems with dates near the end of the month. Maybe a timezone problem, where UTC is at the next month but local time isn't. I tried building with TZ=UTC explicitly but it made no difference.
I'm out of ideas so I figured I'd file a ticket just in case someone might be able to make a guess as to what might have happened.
Here's the output from just one of those failing tests, though there doesn't appear to be anything particularly useful in there.
And just in case it helps (though I don't think it will), this is what gets logged to syslog for that 023333A1 test: