cyrusimap / cassandane

Other
6 stars 11 forks source link

Many test failures relating to bad dates close to the first of the month #23

Open jasontibbitts opened 7 years ago

jasontibbitts commented 7 years ago

Yesterday evening, which for me was about 8PM CDT on May 31st, my normal cassandane run picked up 152 errors. All that I checked were of the form:

1) test_delete(Cassandane::Cyrus::ACL) Perl exception: IMAP Command : 'append' failed. Response was : bad - Invalid date-time in Append command

This morning, with no changes to my packaging, those have all gone away.

Now it's possible that one of the underlying dependencies changed in a buggy way and then was fixed overnight, but I have diffed the installed dependency list between the good and the bad builds and don't see any differences. It's also possible that Cassandane (or even the core Cyrus code, I guess) has some problems with dates near the end of the month. Maybe a timezone problem, where UTC is at the next month but local time isn't. I tried building with TZ=UTC explicitly but it made no difference.

I'm out of ideas so I figured I'd file a ticket just in case someone might be able to make a guess as to what might have happened.

Here's the output from just one of those failing tests, though there doesn't appear to be anything particularly useful in there.

1) test_delete(Cassandane::Cyrus::ACL)
 Perl exception: IMAP Command : 'append' failed. Response was : bad - Invalid date-time in Append command
 at Cassandane/IMAPMessageStore.pm line 176.
        Cassandane::IMAPMessageStore::write_message(Cassandane::IMAPMessageStore=HASH(0x5aa3fa0), Cassandane::Message=HASH(0x5ae53c0)) called at Cassandane/Cyrus/TestCase.pm line 745
        Cassandane::Cyrus::TestCase::_save_message(Cassandane::Cyrus::ACL=HASH(0x40b0e48), Cassandane::Message=HASH(0x5ae53c0), Cassandane::IMAPMessageStore=HASH(0x5aa3fa0)) called at Cassandane/Cyrus/TestCase.pm line 758
        Cassandane::Cyrus::TestCase::make_message(Cassandane::Cyrus::ACL=HASH(0x40b0e48), "Message A", "store", Cassandane::IMAPMessageStore=HASH(0x5aa3fa0)) called at Cassandane/Cyrus/ACL.pm line 93
        Cassandane::Cyrus::ACL::test_delete(Cassandane::Cyrus::ACL=HASH(0x40b0e48)) called at /usr/share/perl5/vendor_perl/Test/Unit/TestCase.pm line 75
        [...framework calls elided...]
Annotations:
=====> Cyrus::TestCase[567] ---------- BEGIN test_delete ----------
=====> Instance[1000] start main instance for test test_delete: basedir /builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1
=====> Instance[451] Found binary cyr_buildinfo in /builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/sbin
=====> Instance[1501] Running: "/builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/sbin/cyr_buildinfo" "-C" "/builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1/conf/imapd.conf"
=====> Instance[451] Found binary ctl_cyrusdb in /builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/sbin
=====> Instance[451] Found binary imapd in /builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/libexec/cyrus-imapd
=====> Instance[912] started saslauthd for /builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1 as 6627
=====> Instance[939] started notifyd for /builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1 as 6633
=====> Instance[1755] opening socket /builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1/run/mux
=====> Instance[451] Found binary master in /builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/libexec/cyrus-imapd
=====> Instance[1501] Running: "/builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/libexec/cyrus-imapd/master" "-C" "/builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1/conf/imapd.conf" "-l" "255" "-p" "/builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1/run/master.pid" "-d" "-M" "/builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1/conf/cyrus.conf"
=====> Instance[864] _start_master: waiting for PID file
=====> Instance[867] _start_master: PID file present and correct
=====> Instance[879] _start_master: PID waiting for services
=====> Daemon[358] is_listening: service imap is listening on 127.0.0.1:9100
=====> Util::Wait[77] Waited 0.068175 sec for 127.0.0.1:9100 to be in LISTEN state
=====> Instance[890] _start_master: all services listening
=====> Instance[966] create user cassandane
=====> Util::Socket[63] create_client_socket INET host=127.0.0.1 port=9100
=====> Instance[1770] authdaemon connection: admin testpw imap
=====> Instance[1031] started main instance for test test_delete: cyrus version 3.0.1-1.fc27 Fedora
=====> Cyrus::TestCase[574] Calling test function
=====> Util::Socket[63] create_client_socket INET host=127.0.0.1 port=9100
=====> Instance[1770] authdaemon connection: admin testpw imap
=====> Instance[966] create user archive
=====> Util::Socket[63] create_client_socket INET host=127.0.0.1 port=9100
=====> Instance[1770] authdaemon connection: admin testpw imap
=====> Util::Socket[63] create_client_socket INET host=127.0.0.1 port=9100
=====> Instance[1770] authdaemon connection: cassandane testpw imap
=====> Cyrus::TestCase[681] Beginning tear_down
=====> Instance[1238] stop
=====> Instance[1200] _stop_pid: sending signal 15 to 6675
=====> Util::Wait[77] Waited 0.025181 sec for unknown condition
=====> Instance[915] killing saslauthd 6627
=====> Instance[942] killing notifyd 6633
=====> Instance[915] killing saslauthd 6627
=====> Cyrus::TestCase[726] ---------- END test_delete ----------
=====> Daemon[387] checking for stray processes on ports: 9100

And just in case it helps (though I don't think it will), this is what gets logged to syslog for that 023333A1 test:

May 31 21:33:34 X 023333A1/master[6675]: process started
May 31 21:33:34 X 023333A1/master[6687]: about to exec /builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/sbin/ctl_cyrusdb
May 31 21:33:34 X 023333A1/ctl_cyrusdb[6687]: skiplist: clean shutdown file missing, updating recovery stamp
May 31 21:33:34 X 023333A1/ctl_cyrusdb[6687]: SQL backend defaulting to engine 'pgsql'
May 31 21:33:34 X 023333A1/ctl_cyrusdb[6687]: recovering cyrus databases
May 31 21:33:34 X 023333A1/ctl_cyrusdb[6687]: done recovering cyrus databases
May 31 21:33:34 X 023333A1/master[6675]: process type:START name:recover path:/builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/sbin/ctl_cyrusdb age:0.000s pid:6687 exited normally
May 31 21:33:34 X 023333A1/master[6675]: ready for work
May 31 21:33:34 X 023333A1/master[6720]: about to exec /builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/libexec/cyrus-imapd/imapd
May 31 21:33:34 X 023333A1/imap[6720]: SQL backend defaulting to engine 'pgsql'
May 31 21:33:34 X 023333A1/imap[6720]: executed
May 31 21:33:34 X 023333A1/imap[6720]: accepted connection
May 31 21:33:34 X 023333A1/imap[6720]: command: 1 Capability
May 31 21:33:34 X 023333A1/imap[6720]: command: 2 Login
May 31 21:33:34 X 023333A1/imap[6720]: login: X [127.0.0.1] admin plaintext User logged in SESSIONID=<023333A1-6720-1496284414-1-6103230945219093468>
May 31 21:33:34 X 023333A1/imap[6720]: command: 3 Create
May 31 21:33:34 X 023333A1/imap[6720]: auditlog: create sessionid=<023333A1-6720-1496284414-1-6103230945219093468> mailbox=<user.cassandane> uniqueid=<e1607c49-309c-430f-8064-913b8204de25> uidvalidity=<1496284415>
May 31 21:33:34 X 023333A1/imap[6720]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6720]: command: 4 Setacl
May 31 21:33:34 X 023333A1/imap[6720]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6720]: auditlog: modseq sessionid=<023333A1-6720-1496284414-1-6103230945219093468> mailbox=<user.cassandane> uniqueid=<e1607c49-309c-430f-8064-913b8204de25> highestmodseq=<2>
May 31 21:33:34 X 023333A1/imap[6720]: command: 5 Setacl
May 31 21:33:34 X 023333A1/imap[6720]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6720]: command: 6 Setacl
May 31 21:33:34 X 023333A1/imap[6720]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6720]: auditlog: modseq sessionid=<023333A1-6720-1496284414-1-6103230945219093468> mailbox=<user.cassandane> uniqueid=<e1607c49-309c-430f-8064-913b8204de25> highestmodseq=<3>
May 31 21:33:34 X 023333A1/imap[6720]: command: 7 Logout
May 31 21:33:34 X 023333A1/imap[6720]: USAGE admin user: 0.014051 sys: 0.008384
May 31 21:33:34 X 023333A1/imap[6720]: auditlog: traffic sessionid=<023333A1-6720-1496284414-1-6103230945219093468> bytes_in=<207> bytes_out=<1600>
May 31 21:33:34 X 023333A1/imap[6720]: accepted connection
May 31 21:33:34 X 023333A1/imap[6720]: command: 1 Capability
May 31 21:33:34 X 023333A1/imap[6720]: command: 2 Login
May 31 21:33:34 X 023333A1/imap[6720]: login: X [127.0.0.1] admin plaintext User logged in SESSIONID=<023333A1-6720-1496284414-2-4184462040807062084>
May 31 21:33:34 X 023333A1/master[6739]: about to exec /builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/libexec/cyrus-imapd/imapd
May 31 21:33:34 X 023333A1/imap[6739]: SQL backend defaulting to engine 'pgsql'
May 31 21:33:34 X 023333A1/imap[6739]: executed
May 31 21:33:34 X 023333A1/imap[6739]: accepted connection
May 31 21:33:34 X 023333A1/imap[6739]: command: 1 Capability
May 31 21:33:34 X 023333A1/imap[6739]: command: 2 Login
May 31 21:33:34 X 023333A1/imap[6739]: login: X [127.0.0.1] admin plaintext User logged in SESSIONID=<023333A1-6739-1496284414-1-10713178190874967929>
May 31 21:33:34 X 023333A1/imap[6739]: command: 3 Create
May 31 21:33:34 X 023333A1/imap[6739]: auditlog: create sessionid=<023333A1-6739-1496284414-1-10713178190874967929> mailbox=<user.archive> uniqueid=<0535db4d-913b-4814-b311-c0d3c8f6080c> uidvalidity=<1496284415>
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 4 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: auditlog: modseq sessionid=<023333A1-6739-1496284414-1-10713178190874967929> mailbox=<user.archive> uniqueid=<0535db4d-913b-4814-b311-c0d3c8f6080c> highestmodseq=<2>
May 31 21:33:34 X 023333A1/imap[6739]: command: 5 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 6 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: auditlog: modseq sessionid=<023333A1-6739-1496284414-1-10713178190874967929> mailbox=<user.archive> uniqueid=<0535db4d-913b-4814-b311-c0d3c8f6080c> highestmodseq=<3>
May 31 21:33:34 X 023333A1/imap[6739]: command: 7 Create
May 31 21:33:34 X 023333A1/imap[6739]: auditlog: create sessionid=<023333A1-6739-1496284414-1-10713178190874967929> mailbox=<user.archive.cassandane> uniqueid=<b2b7f3f1-8aca-414a-a3df-6f0d77bf3d98> uidvalidity=<1496284415>
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 8 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 9 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 10 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 11 Create
May 31 21:33:34 X 023333A1/imap[6739]: auditlog: create sessionid=<023333A1-6739-1496284414-1-10713178190874967929> mailbox=<user.archive.cassandane.sent> uniqueid=<82567d89-a0c0-493c-99eb-d7d95950f842> uidvalidity=<1496284415>
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 12 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 13 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 14 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6739]: command: 15 Logout
May 31 21:33:34 X 023333A1/imap[6739]: USAGE admin user: 0.018378 sys: 0.005736
May 31 21:33:34 X 023333A1/imap[6720]: command: 3 Setacl
May 31 21:33:34 X 023333A1/imap[6739]: auditlog: traffic sessionid=<023333A1-6739-1496284414-1-10713178190874967929> bytes_in=<594> bytes_out=<1735>
May 31 21:33:34 X 023333A1/imap[6720]: NOTIFY: response OK to method pusher
May 31 21:33:34 X 023333A1/imap[6720]: auditlog: modseq sessionid=<023333A1-6720-1496284414-2-4184462040807062084> mailbox=<user.archive.cassandane.sent> uniqueid=<82567d89-a0c0-493c-99eb-d7d95950f842> highestmodseq=<2>
May 31 21:33:34 X 023333A1/imap[6739]: accepted connection
May 31 21:33:34 X 023333A1/imap[6739]: command: 1 Capability
May 31 21:33:34 X 023333A1/imap[6739]: command: 2 Login
May 31 21:33:34 X 023333A1/imap[6739]: login: X [127.0.0.1] cassandane plaintext User logged in SESSIONID=<023333A1-6739-1496284414-2-14844595030300103750>
May 31 21:33:34 X 023333A1/imap[6720]: command: 4 Select
May 31 21:33:34 X 023333A1/imap[6720]: seen_db: user admin opened /builddir/build/BUILD/cyrus-imapd-3.0.1/cassandane/work/023333A1/conf/user/a/admin.seen
May 31 21:33:34 X 023333A1/imap[6720]: open: user admin opened user.archive.cassandane.sent
May 31 21:33:34 X 023333A1/imap[6720]: command: 5 Append
May 31 21:33:34 X 023333A1/imap[6739]: command: 3 Logout
May 31 21:33:34 X 023333A1/imap[6739]: USAGE cassandane user: 0.000689 sys: 0.000215
May 31 21:33:34 X 023333A1/imap[6739]: auditlog: traffic sessionid=<023333A1-6739-1496284414-2-14844595030300103750> bytes_in=<53> bytes_out=<1537>
May 31 21:33:34 X 023333A1/imap[6720]: command: 6 Logout
May 31 21:33:34 X 023333A1/imap[6720]: USAGE admin user: 0.001648 sys: 0.000984
May 31 21:33:34 X 023333A1/imap[6720]: auditlog: traffic sessionid=<023333A1-6720-1496284414-2-4184462040807062084> bytes_in=<219> bytes_out=<1930>
May 31 21:33:34 X 023333A1/master[6675]: attempting clean shutdown on signal
May 31 21:33:34 X 023333A1/imap[6739]: graceful shutdown
May 31 21:33:34 X 023333A1/imap[6720]: graceful shutdown
May 31 21:33:34 X 023333A1/imap[6739]: auditlog: traffic sessionid=<023333A1-6739-1496284414-2-14844595030300103750> bytes_in=<0> bytes_out=<0>
May 31 21:33:34 X 023333A1/imap[6720]: auditlog: traffic sessionid=<023333A1-6720-1496284414-2-4184462040807062084> bytes_in=<0> bytes_out=<0>
May 31 21:33:34 X 023333A1/master[6675]: process type:SERVICE name:imap path:/builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/libexec/cyrus-imapd/imapd age:0.227s pid:6720 exited, status 75
May 31 21:33:34 X 023333A1/master[6675]: process type:SERVICE name:imap path:/builddir/build/BUILDROOT/cyrus-imapd-3.0.1-1.fc27.x86_64/usr/libexec/cyrus-imapd/imapd age:0.194s pid:6739 exited, status 75
May 31 21:33:34 X 023333A1/master[6675]: All children have exited, closing down
elliefm commented 7 years ago

If you have the opportunity (since the month is about to roll over again), are you able to reproduce this with the following patch in place?

diff --git a/imap/imapd.c b/imap/imapd.c
index 3dfb5bb..d73bc1c 100644
--- a/imap/imapd.c
+++ b/imap/imapd.c
@@ -12313,8 +12313,11 @@ static int getdatetime(time_t *date)
     buf[i] = '\0';

     r = time_from_rfc3501(buf, date);
-    if (r < 0)
+    if (r < 0) {
+        syslog(LOG_DEBUG, "%s: time_from_rfc3501 rejected date string: '%s'",
+                          __func__, buf);
         goto baddate;
+    }

     c = prot_getc(imapd_in);
     return c;

My hunch at the moment is that this is going to turn out to be some sort of bug in Cassandane, but it'd be very useful to see what's actually ending up in Cyrus

elliefm commented 7 years ago

Though it might also be somehow related to this: https://github.com/cyrusimap/cyrus-imapd/issues/2044

elliefm commented 7 years ago

Suspect something going awry here: https://github.com/cyrusimap/cassandane/blob/bcefcdb220b32f83bd468fce8fad7080e9d6ed53/Cassandane/Util/DateTime.pm#L190-L205

In the case of the test failure in earlier comments, I think we get in here with no arguments, so I guess localtime() will be returning the current time (I'm not sure why it doesn't whinge about $dt->epoch though?)

If you get a window to reproduce this, can you please do so with this patch applied to Cassandane and -vvv:

diff --git a/Cassandane/Generator.pm b/Cassandane/Generator.pm
index 7abcbfd..6144eac 100644
--- a/Cassandane/Generator.pm
+++ b/Cassandane/Generator.pm
@@ -49,6 +49,7 @@ use Cassandane::Util::DateTime qw(to_rfc822 from_iso8601);
 use Cassandane::Address;
 use Cassandane::Message;
 use Cassandane::Util::SHA;
+use Cassandane::Util::Log;

 our $admin = 'qa@cyrus.works';

@@ -281,6 +282,7 @@ sub generate
     my ($self, @aparams) = @_;
     my $params = $self->_params_defaults(@aparams);
     my $datestr = to_rfc822($params->{date});
+    xlog "generating message with datestr '$datestr'";
     my $from = $params->{from};
     my $to = $params->{to};
     my $extra_lines = $params->{extra_lines};
elliefm commented 7 years ago

Scratch that last patch, here's a more helpful one (additional logging):

diff --git a/Cassandane/Generator.pm b/Cassandane/Generator.pm
index 7abcbfd..b352253 100644
--- a/Cassandane/Generator.pm
+++ b/Cassandane/Generator.pm
@@ -49,6 +49,7 @@ use Cassandane::Util::DateTime qw(to_rfc822 from_iso8601);
 use Cassandane::Address;
 use Cassandane::Message;
 use Cassandane::Util::SHA;
+use Cassandane::Util::Log;

 our $admin = 'qa@cyrus.works';

@@ -280,7 +281,9 @@ sub generate
 {
     my ($self, @aparams) = @_;
     my $params = $self->_params_defaults(@aparams);
+    xlog "entered generate with date param '$params->{date}'";
     my $datestr = to_rfc822($params->{date});
+    xlog "generating message with datestr '$datestr'";
     my $from = $params->{from};
     my $to = $params->{to};
     my $extra_lines = $params->{extra_lines};
diff --git a/Cassandane/Message.pm b/Cassandane/Message.pm
index b4b7036..93b42d2 100644
--- a/Cassandane/Message.pm
+++ b/Cassandane/Message.pm
@@ -432,6 +432,7 @@ sub set_internaldate
     {
        $id = to_rfc3501($id);
     }
+    xlog "setting internaldate to '$id'";
     $self->set_attribute(internaldate => $id);
 }
elliefm commented 4 years ago

I think I have finally tripped this myself, on 2020-05-01 at 11:36am AEST (24 minutes before it became May in UTC)

Curiously, the only test that seemed to fail as a direct result was JMAPEmail.email_get. I wonder if this was already partially fixed at some point?

The failure in JMAPEmail.email_get is happening because, in Cassandane::Util::DateTime, the to_rfc822 and to_rfc3501 functions disagree about the date.

to_rfc822 is used to format the Date header, and the dates in the Received headers. to_rfc3501 is used to format the timestamp in the APPEND command

Here's what the append looks like during the window where the problem occurs:

C: 4 append inbox " 1-Apr-2020 09:52:20 +1000" {811}
S: + go ahead
[...]
Date: Fri, 01 May 2020 09:52:20 +1000

The same DateTime object is used to format both of these dates (in Cassandane::Generator), and is used again later (in the test) when checking that the message has the correct fields. But since the functions produced different dates in the message, one of those checks will definitely fail.

Once UTC time ticks over to the next month, the test starts to pass again, and the window for testing this closes. I don't know when the window opens, but I suspect it's probably "when the local date's month starts to differ from the UTC date's month".

I wonder if this is even more wonky if your timezone is behind UTC, and not ahead of it like mine -- maybe that explains @jasontibbitts getting many more errors in this window than I do?

At time of writing, this is the to_rfc822 implementation: https://github.com/cyrusimap/cassandane/blob/037d5bdd4663adf16448ec87d673bd684ecfe7f9/Cassandane/Util/DateTime.pm#L186-L203

And this is the to_rfc3501 implementation: https://github.com/cyrusimap/cassandane/blob/037d5bdd4663adf16448ec87d673bd684ecfe7f9/Cassandane/Util/DateTime.pm#L247-L254

Some things that are worth observing, that may turn out to be meaningful:

The fix might be as simple as making to_rfc3501 decompose the object first in the same way that to_rfc822 does, but I can't verify that anymore today!

elliefm commented 4 years ago

Archive.archive_messages and Archive.archive_messages_archive_annotation were also consistently failing during this window, though they didn't look related when I looked at them.

But they're passing again now with no changes, so maybe they are related, but in a less obvious way.