cyrusimap / cassandane

Other
6 stars 11 forks source link

Cassandane unhappy on 32-bit machine #21

Closed jasontibbitts closed 7 years ago

jasontibbitts commented 7 years ago

I noticed several additional cassandane test failures when running on a 32-bit machine. Turns out that you can't use the 'Q' type in unpack on 32-bit perl. Oops.

There's one place in Cassandane/Cyrus/Metadata.pm that uses this:

    if ($entry[2]) {
        my ($modseq, $modtime, $flags) =
            unpack("Q>Q>C", bytes::substr($entry[2], $Config{longsize}));
        if ($flags and not $tombstones) {
            next;
        }
        if ($withmdata) {
            $annot->{modseq} = $modseq;
            $annot->{flags} = $flags;
        }
    }

Some searching tells me that you can replace "Q>" with "N2" if you process the entries in the list in pairs. But I've never really gotten into Perl at this level and so this is a bit beyond me.

jasontibbitts commented 7 years ago

I instrumented that code to see what the unpack call gets passed and what it returns and it turns out that it gets called about 50 times in a regular Cassandane run and every single time, $entry[2] simply contains eight zero bytes. And on a 64-bit machine, $Config{longsize} is 8. Which kind of makes me wonder.... The first 'Q' would take eight bytes. What is supposed to be left to unpack into $modtime or $flags?

But that does tell me that to get things going today, I can probably just set modseq=0 and leave modtime and flags undefined and everything should progress unchanged.

rsto commented 7 years ago

We'll need to the modseq to test the latest features for annotations.

I'd rather update cyr_buildinfo to output if it is built for 32-bit or 64-bit. We already read cyr_buildinfo in Cassandane and update the Perl unpacking accordingly.

rsto commented 7 years ago

Sorry, that was written too hastily, I meant: we already read cyr_buildinfo in Cassandane and need to update the Perl unpacking accordingly. I'll work on this now.

rsto commented 7 years ago

@jasontibbitts i've pushed a fix to Cassandane master: 64153d7958a2091055643a9fbb3b0ca314409010

does this work out on your 32-bit system?

jasontibbitts commented 7 years ago

Unfortunately I still see a number of failures which I don't have on a 64-bit platform.

You can see a build using cyrus 3.0.1 and the current Cassandane code at https://koji.fedoraproject.org/koji/taskinfo?taskID=19785301

Note that x86_64 passes, while i686 and armv7hl fail identically. (I know there are tests which are expected to fail when run against 3.0.1; they are already excluded so that things are fine on x86_64.) Please also ignore failures relating to ppc64 and s390x; those are big-endian machines and Cyrus seems to have a separate issue there (it fails in the test suite, which is https://github.com/cyrusimap/cyrus-imapd/issues/1993).

So if you click through to the i686 build log at https://kojipkgs.fedoraproject.org//work/tasks/5307/19785307/build.log and search for "Bug3470" or something like that, you'll see the Cassandane run and all of the output it gives for the failing tests. Sadly I'm still seeing Perl exception: Invalid type 'Q' in unpack in there. That seems to account for most of the errors, but not all of them. There's also Perl exception: child process (binary reconstruct pid 30591) exited with code 75 in test_reconstruct_orphans(Cassandane::Cyrus::Quota) and test_reconstruct(Cassandane::Cyrus::Quota).

There are also some plain failures; this one looks odd:

1) test_setcontacts_importance_later(Cassandane::Cyrus::JMAPContacts)
 expected -0.1, got -0.1 at Cassandane/Cyrus/JMAPContacts.pm line 1066.
    Cassandane::Cyrus::JMAPContacts::test_setcontacts_importance_later(Cassandane::Cyrus::JMAPContacts=HASH(0xa9c88c0)) called at /usr/share/perl5/vendor_perl/Test/Unit/TestCase.pm line 75

and three others which fail with expected 10, got 0.

And looking at that patch, it can't help because if the 64-bit types are supported you can simply never call unpack("Q"). However, I note that soon at least the development version of Fedora will have the compile flag turned on so that the 64-bit types will be supported on 32-bit perl. That won't help me for Fedora 26 but I can work around it.

The failures that aren't due to unpack("Q") are also problematic, though. Sorry I didn't mention them earlier but I hadn't yet managed to dig through the entire failed test log.

jasontibbitts commented 7 years ago

So the Perl churn in Fedora rawhide is done and 32-bit Perl now supports the quad types, so the failures related to that are no longer problematic for me but might still be to someone else depending on how their Perl is built. However, there are still five tests which fail on 32-bit machines for reasons I simply do not understand, but it can't be a coincidence that they all involve "reconstruct" somehow:

The reconstruct tests all fail pretty much identically:

 Perl exception: child process (binary reconstruct pid 42954) exited with code 75
 at Cassandane/Instance.pm line 1427.
    Cassandane::Instance::__ANON__(HASH(0x5c28a584), 75) called at Cassandane/Instance.pm line 1609
    Cassandane::Instance::_handle_wait_status(Cassandane::Instance=HASH(0x59d6914c), 42954) called at Cassandane/Instance.pm line 1399
    Cassandane::Instance::reap_command(Cassandane::Instance=HASH(0x59d6914c), 42954) called at Cassandane/Instance.pm line 1389
    Cassandane::Instance::run_command(Cassandane::Instance=HASH(0x59d6914c), HASH(0x5b97fa90), "reconstruct", "user.cassandane") called at Cassandane/Cyrus/Quota.pm line 2383
    Cassandane::Cyrus::Quota::test_reconstruct(Cassandane::Cyrus::Quota=HASH(0x5b2543e4)) called at /usr/share/perl5/vendor_perl/Test/Unit/TestCase.pm line 75
    [...framework calls elided...]

The Quota tests both fail with:

 expected 10, got 0 at Cassandane/Cyrus/Reconstruct.pm line 194.
    Cassandane::Cyrus::Reconstruct::test_reconstruct_removedfile(Cassandane::Cyrus::Reconstruct=HASH(0x5b2a97dc)) called at /usr/share/perl5/vendor_perl/Test/Unit/TestCase.pm line 75
    [...framework calls elided...]

If you would like the complete output, a full build log is at https://kojipkgs.fedoraproject.org//work/tasks/8985/20038985/build.log

I note that exit code 75 is EX_TEMPFAIL, and that in at least some places in the cyrus code, that's an intentional return. So maybe what remains is now a Cyrus bug instead of a Cassandane bug but honestly I don't know enough to make that determination. If that is the case, please let me know and I'll move this over.

elliefm commented 7 years ago

if you look in 'lib/exitcodes.h' you can see that we really unhelpfully define a bunch of Cyrus exit codes to actually just be EX_TEMPFAIL (I think this is for integration with sendmail?), so "75" is about the least meaningful number that reconstruct could possibly return :/

But all the places that imap/reconstruct.c exits (either via fatal() or exit()) seem to be usage errors exiting with EC_USAGE (== EX_TEMPFAIL).

So if you fiddle with lib/exitcodes.c and change EC_USAGE to some value other than EX_TEMPFAIL, rebuild and test again, do the reconstruct tests start whinging about the new number? Or do they continue to get 75?

Alternatively, if you change each of the fatal() and exit() calls in imap/reconstruct.c to exit with a distinct value, rather than EC_USAGE, we could nail down where exactly it's barfing out...

jasontibbitts commented 7 years ago

I just set every exit and fatal call to be different positive integers. And.... no dice; it still exits 75.

So I just set up a i686 chroot and ran it all there. I did some bind mounting to get syslog to work and got some logs:

Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: SQL backend defaulting to engine 'pgsql'
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: reconstructing user.cassandane
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: IOERROR: opening index user.cassandane: System I/O error
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: create new mailbox user.cassandane
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: failed to read index header for user.cassandane
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: auditlog: create sessionid=<1642131-28240-1497631334-1-12215880127013081496> mailbox=<user.cassandane> uniqueid=<912cae39-c727-4d94-8728-db82b9860590> uidvalidity=<1497631335>
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 4110043912 not found
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 7 found - adding
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 9 found - adding
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 842282544 not found
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 1534354292 not found
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 5 rediscovered - appending
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 4 rediscovered - appending
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 3 rediscovered - appending
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 2 rediscovered - appending
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: user.cassandane uid 1 rediscovered - appending
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: removing stale annotations for 4110043888
Jun 16 11:42:14 epithumia.math.uh.edu 1642131/reconstruct[28240]: removing stale annotations for 0

Which doesn't look bad, though I'm not sure if those "not found" things are expected. I'm also not sure why it's defaulting to pgsql. Maybe it's just picking one of the configured ones at random and that doesn't matter.

I had also run cassandane under strace and at the end there:

send(3, "<21>Jun 16 11:54:15 1654141/reconstruct[31220]: removing stale annotations for 0", 80, MSG_NOSIGNAL) = 80
write(2, "fatal error: Internal error: assertion failed: imap/annotate.c: 3390: uid\n", 74) = 74
write(1, "user.cassandane: failed to read index header\nuser.cassandane uid 4110154504 not founduser.cassandane uid 7 found - adding\nuser.cassandane uid 9 found - adding\nuser.cassandane uid 875902512 not founduser.cassandane uid 1534354292 not founduser.cassandane uid 5 rediscovered - appending\nuser.cassandane uid 4 rediscovered - appending\nuser.cassandane uid 3 rediscovered - appending\nuser.cassandane uid 2 rediscovered - appending\nuser.cassandane uid 1 rediscovered - appending\nremoving stale annotations for 4110154480\nremoving stale annotations for 0\n", 548) = 548
exit_group(75)                          = ?

So, the reason we're still getting 75 is that this is going through the assertionfailed() function in lib/assert.c, so that's one question answered.

But now we can also see just what's going on. That assert is in HIDDEN int annotate_msg_cleanup(struct mailbox *mailbox, unsigned int uid). And uid is simply being passed as zero. Too bad we don't get a backtrace. Of course with those large uid numbers and this being a 32-bit issue I'm wondering if there's an integer overflow somewhere.

Also, does Cassandane save the output from the commands it's running anywhere? Because having that immediately available would have made this a bit easier.

jasontibbitts commented 7 years ago

I used my vague recollections of C syntax and some searching to hack up a patch to get the result of the glibc backtrace() function into the output of the assert function. But it appears that's not as useful as I would have hoped, maybe because Cyrus appears to do some internal static linking. Nevertheless, here's the backtrace from that assertion.

fatal error: Internal error: assertion failed: imap/annotate.c: 3390: uid
lib/libcyrus_min.so.0(assertionfailed+0xa3) [0xf70d54c3]
lib/libcyrus_imap.so.0(+0x1e8c9) [0xf71e18c9]
lib/libcyrus_imap.so.0(mailbox_reconstruct+0x1b13) [0xf7211753]
sbin/reconstruct(+0x24da) [0x566054da]
lib/libcyrus_imap.so.0(+0x5633d) [0xf721933d]
lib/libcyrus.so.0(+0x42465) [0xf7145465]
lib/libcyrus_imap.so.0(+0x536db) [0xf72166db]
lib/libcyrus_imap.so.0(+0x5410e) [0xf721710e]
lib/libcyrus_imap.so.0(mboxlist_findallmulti+0xe8) [0xf721d928]
lib/libcyrus_imap.so.0(mboxlist_findall+0xa1) [0xf721da11]
sbin/reconstruct(main+0x663) [0x56604c53]
/lib/libc.so.6(__libc_start_main+0xf3) [0xf4e5f373]
usr/sbin/reconstruct(+0x2183) [0x56605183]
elliefm commented 7 years ago

This looks like a bug in reconstruct -- might be worth moving this issue to the cyrus-imapd repo?

If you add this patch, where does the assertion throw?

diff --git a/imap/mailbox.c b/imap/mailbox.c
index 9116bd1..713d182 100644
--- a/imap/mailbox.c
+++ b/imap/mailbox.c
@@ -6331,6 +6331,7 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)

         /* bogus annotations? */
         while (annots.pos < annots.nused && annots.found[annots.pos].uid < record.uid) {
+            assert(annots.found[annots.pos].uid != 0);
             add_found(&delannots, annots.found[annots.pos].uid, /*isarchive*/0);
             annots.pos++;
         }
@@ -6404,6 +6405,7 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)

         /* bogus annotations? */
         while (annots.pos < annots.nused && annots.found[annots.pos].uid < uid) {
+            assert(annots.found[annots.pos].uid != 0);
             add_found(&delannots, annots.found[annots.pos].uid, /*isarchive*/0);
             annots.pos++;
         }
@@ -6416,6 +6418,7 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)

     /* bogus annotations after the end? */
     while (annots.pos < annots.nused) {
+        assert(annots.found[annots.pos].uid != 0);
         add_found(&delannots, annots.found[annots.pos].uid, /*isarchive*/0);
         annots.pos++;
     }
elliefm commented 7 years ago

The SQL database to use is taken from the sql_engine setting in imapd.conf. If this is not set, the default is the first of "mysql", "pgsql" or "sqlite" that has been compiled in. So it's defaulting to pgsql because you have that compiled in, but don't have mysql compiled in.

Though, why it's using an SQL database here (as opposed to, say, twoskip) is another interesting question...

I guess it could also be a bug in cyrusdb_sql, or the pgsql backend specificially, maybe let's hold off on making a new issue for the time being, and I'll see what further insight I can gain here. I'll probably send you through some more diagnostic patches too.

elliefm commented 7 years ago

Okay, that "SQL backend defaulting to engine" thing is a total red herring -- it's just a helpful notice that prints during cyrus initialisation if cyrusdb SQL support is available (which it is for you, because you've enabled postgres as one of the backends). I hadn't seen it before myself, because I don't usually build with sql backends enabled

The annotations code won't use sql backends anyway (annotation_db setting only permits skiplist, twoskip or lmdb), so whatever the problem is here, it's not some sql translation issue.

jasontibbitts commented 7 years ago

With that patch, this is how both Quota.reconstruct and Quota.reconstruct_orphans fail:

fatal error: Internal error: assertion failed: imap/mailbox.c: 6406: annots.found[annots.pos].uid != 0

Feel free to move this to wherever you think might be appropriate, or tell me and I'll report the two remaining issues (the two Quota.reconstruct errors and the three Reconstruct.reconstruct failures) separately.

elliefm commented 7 years ago

Hmmm... still puzzled about what's going on!

Here's a newer, longer, diagnostic patch -- can you please run Quota.reconstruct and Quota.reconstruct_orphans with this applied, and collect the syslog output for each? I'm syslogging to LOG_DEBUG -- if that log level is difficult to obtain in your chroot, you could just replace LOG_DEBUG with LOG_ERR in the patch before applying it.

diff --git a/imap/mailbox.c b/imap/mailbox.c
index 713d182..03ce16e 100644
--- a/imap/mailbox.c
+++ b/imap/mailbox.c
@@ -5292,7 +5292,7 @@ EXPORTED int mailbox_copyfile(const char *from, const char *to, int nolink)
 /*                      RECONSTRUCT SUPPORT                               */
 /* ---------------------------------------------------------------------- */

-#define UIDGROW 300
+#define UIDGROW 16

 struct found_uid {
     uint32_t uid;
@@ -6179,11 +6179,11 @@ static int mailbox_wipe_index_record(struct mailbox *mailbox,
     return 0;
 }

-static int addannot_uid(const char *mailbox __attribute__((unused)),
+static int addannot_uid(const char *mailbox,
                         uint32_t uid,
-                        const char *entry __attribute__((unused)),
+                        const char *entry,
                         const char *userid __attribute__((unused)),
-                        const struct buf *value __attribute__((unused)),
+                        const struct buf *value,
                         void *rock)
 {
     struct found_uids *annots = (struct found_uids *)rock;
@@ -6192,6 +6192,8 @@ static int addannot_uid(const char *mailbox __attribute__((unused)),
      * will be together in a 'foreach' response */
     if (!annots->nused || annots->found[annots->nused-1].uid != uid) {
         /* we don't support an archive annotations DB yet */
+        syslog(LOG_DEBUG, "%s: adding found annot: uid=%u mailbox=%s %s=%.20s...",
+                          __func__, uid, mailbox, entry, buf_cstring(value));
         add_found(annots, uid, /*isarchive*/0);
     }

@@ -6242,6 +6244,20 @@ out:
     return r;
 }

+static void dump_found_uids(const struct found_uids *found, const char *d1, const char *d2)
+{
+    unsigned z;
+    syslog(LOG_DEBUG, "%s: %s: nalloc=%u nused=%u pos=%u",
+                      d1, d2, found->nalloc, found->nused, found->pos);
+    for (z = 0; z < found->nused; z++) {
+        syslog(LOG_DEBUG, "%s: %s: found uid=%u isarchive=%u",
+                          d1, d2, found->found[z].uid, found->found[z].isarchive);
+    }
+    for (; z < found->nalloc; z++) {
+        syslog(LOG_DEBUG, "%s: %s: empty uid=%u isarchive=%u",
+                          d1, d2, found->found[z].uid, found->found[z].isarchive);
+    }
+}

 /*
  * Reconstruct the single mailbox named 'name'
@@ -6308,9 +6324,11 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)

     r = find_files(mailbox, &files, flags);
     if (r) goto close;
+    dump_found_uids(&files, __func__, "find files");

     r = find_annots(mailbox, &annots);
     if (r) goto close;
+    dump_found_uids(&annots, __func__, "find annots");

     uint32_t recno;
     struct index_record record;
@@ -6330,6 +6348,7 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)
         last_seen_uid = record.uid;

         /* bogus annotations? */
+        syslog(LOG_DEBUG, "%s: want to delete annotations with uid < %u", __func__, record.uid);
         while (annots.pos < annots.nused && annots.found[annots.pos].uid < record.uid) {
             assert(annots.found[annots.pos].uid != 0);
             add_found(&delannots, annots.found[annots.pos].uid, /*isarchive*/0);
@@ -6338,10 +6357,12 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)

         /* skip over current */
         while (annots.pos < annots.nused && annots.found[annots.pos].uid == record.uid) {
+            syslog(LOG_DEBUG, "%s: skipping annotation with uid == %u", __func__, record.uid);
             annots.pos++;
         }

         /* lower UID file exists */
+        syslog(LOG_DEBUG, "%s: discover files with uid < %u", __func__, record.uid);
         while (files.pos < files.nused && files.found[files.pos].uid < record.uid) {
             add_found(&discovered, files.found[files.pos].uid, files.found[files.pos].isarchive);
             files.pos++;
@@ -6387,6 +6408,7 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)
     }

     /* add discovered messages before last_uid to the list in order */
+    syslog(LOG_DEBUG, "%s: discover files with uid <= %u", __func__, mailbox->i.last_uid);
     while (files.pos < files.nused && files.found[files.pos].uid <= mailbox->i.last_uid) {
         add_found(&discovered, files.found[files.pos].uid, files.found[files.pos].isarchive);
         files.pos++;
@@ -6404,6 +6426,7 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)
         /* we can keep this annotation too... */

         /* bogus annotations? */
+        syslog(LOG_DEBUG, "%s: want to delete annotations with uid < %u", __func__, uid);
         while (annots.pos < annots.nused && annots.found[annots.pos].uid < uid) {
             assert(annots.found[annots.pos].uid != 0);
             add_found(&delannots, annots.found[annots.pos].uid, /*isarchive*/0);
@@ -6412,11 +6435,13 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)

         /* skip over current */
         while (annots.pos < annots.nused && annots.found[annots.pos].uid == uid) {
+            syslog(LOG_DEBUG, "%s: skipping annotation with uid == %u", __func__, uid);
             annots.pos++;
         }
     }

     /* bogus annotations after the end? */
+    syslog(LOG_DEBUG, "%s: want to delete remaining annotations", __func__);
     while (annots.pos < annots.nused) {
         assert(annots.found[annots.pos].uid != 0);
         add_found(&delannots, annots.found[annots.pos].uid, /*isarchive*/0);
@@ -6424,6 +6449,7 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)
     }

     /* handle new list - note, we don't copy annotations for these */
+    dump_found_uids(&discovered, __func__, "discovered files");
     while (discovered.pos < discovered.nused) {
         r = mailbox_reconstruct_append(mailbox, discovered.found[discovered.pos].uid,
                                        discovered.found[discovered.pos].isarchive, flags);
@@ -6431,6 +6457,7 @@ EXPORTED int mailbox_reconstruct(const char *name, int flags)
         discovered.pos++;
     }

+    dump_found_uids(&delannots, __func__, "delannots");
     if (delannots.nused) {
         r = reconstruct_delannots(mailbox, &delannots, flags);
         if (r) goto close;
elliefm commented 7 years ago

If it's easier for you to grab these patches from git, rather than c&p from the web, my work-in-progress on this is here: https://github.com/elliefm/cyrus-imapd/tree/v30/reconstruct-be32

elliefm commented 7 years ago

(It would be fairly trivial to avoid the uid=0 problem by just checking for that, but I'm trying to understand how they're getting in there. I'm also really bothered by those big numbers when these tests only create ~10 uids!)

jasontibbitts commented 7 years ago

It takes some doing to get syslog working but it is possible. I also note that mock is in Debian, apparently with our configs already in place, so if I'm correct in my assumption that you use Debian, it should be relatively easy to replicate the environment in which I'm getting these test failures.

Here's what just the reconstruct process logs for Quota.reconstruct:

1516581/reconstruct[11395]: SQL backend defaulting to engine 'pgsql'
1516581/reconstruct[11395]: reconstructing user.cassandane
1516581/reconstruct[11395]: IOERROR: opening index user.cassandane: System I/O error
1516581/reconstruct[11395]: create new mailbox user.cassandane
1516581/reconstruct[11395]: failed to read index header for user.cassandane
1516581/reconstruct[11395]: auditlog: create sessionid=<1516581-11395-1497971819-1-18000607619134041552> mailbox=<user.cassandane> uniqueid=<565ee95a-6de7-4be0-acd4-f77773fd2eb9> uidvalidity=<1497971820>
1516581/reconstruct[11395]: mailbox_reconstruct: find files: nalloc=16 nused=10 pos=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=4110088784 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=0 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=6 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=8 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=10 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=5 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=4 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=3 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=2 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: found uid=1 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=1 mailbox=user.cassandane /altsubject= benzoxy benzoxy ben...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=10 mailbox=user.cassandane /altsubject= bdellid bdellid bde...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=2 mailbox=user.cassandane /altsubject= anagep anagep anage...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=3 mailbox=user.cassandane /altsubject= advert advert adver...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=4 mailbox=user.cassandane /altsubject= auctor auctor aucto...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=5 mailbox=user.cassandane /altsubject= aloofe aloofe aloof...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=6 mailbox=user.cassandane /altsubject= berio berio berio b...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=7 mailbox=user.cassandane /altsubject= andaste andaste and...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=8 mailbox=user.cassandane /altsubject= amadi amadi amadi a...
1516581/reconstruct[11395]: addannot_uid: adding found annot: uid=9 mailbox=user.cassandane /altsubject= belier belier belie...
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: nalloc=16 nused=10 pos=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=4110088624 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=0 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=2 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=4 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=1465875408 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=5 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=6 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=7 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=8 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: found uid=9 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: empty uid=1634040167 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: empty uid=4294967295 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: empty uid=808139056 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: empty uid=4294967295 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: empty uid=1952543298 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: find annots: empty uid=1393189238 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: discover files with uid <= 0
1516581/reconstruct[11395]: user.cassandane uid 4110088784 not found
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 4110088784
1516581/reconstruct[11395]: user.cassandane uid 0 not found
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 0
1516581/reconstruct[11395]: user.cassandane uid 6 found - adding
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 6
1516581/reconstruct[11395]: user.cassandane uid 8 found - adding
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 8
1516581/reconstruct[11395]: user.cassandane uid 10 found - adding
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 10
1516581/reconstruct[11395]: user.cassandane uid 5 rediscovered - appending
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 5
1516581/reconstruct[11395]: user.cassandane uid 4 rediscovered - appending
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 4
1516581/reconstruct[11395]: user.cassandane uid 3 rediscovered - appending
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 3
1516581/reconstruct[11395]: user.cassandane uid 2 rediscovered - appending
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 2
1516581/reconstruct[11395]: user.cassandane uid 1 rediscovered - appending
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete annotations with uid < 1
1516581/reconstruct[11395]: mailbox_reconstruct: want to delete remaining annotations
1516581/reconstruct[11395]: mailbox_reconstruct: discovered files: nalloc=0 nused=0 pos=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: nalloc=16 nused=10 pos=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=4110088624 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=0 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=2 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=4 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=1465875408 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=5 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=6 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=7 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=8 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: found uid=9 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: empty uid=1701732708 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: empty uid=4294967295 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: empty uid=1633905010 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: empty uid=4294967295 isarchive=0
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: empty uid=808126802 isarchive=1
1516581/reconstruct[11395]: mailbox_reconstruct: delannots: empty uid=825570608 isarchive=0
1516581/reconstruct[11395]: removing stale annotations for 4110088624
1516581/reconstruct[11395]: removing stale annotations for 0

And for Quota.reconstruct_orphans:

1526111/reconstruct[12465]: SQL backend defaulting to engine 'pgsql'
1526111/reconstruct[12465]: reconstructing user.cassandane
1526111/reconstruct[12465]: IOERROR: opening index user.cassandane: System I/O error
1526111/reconstruct[12465]: create new mailbox user.cassandane
1526111/reconstruct[12465]: failed to read index header for user.cassandane
1526111/reconstruct[12465]: auditlog: create sessionid=<1526111-12465-1497972372-1-412116222712523469> mailbox=<user.cassandane> uniqueid=<29662999-2ed3-466d-9fa4-034c41f00c73> uidvalidity=<1497972373>
1526111/reconstruct[12465]: mailbox_reconstruct: find files: nalloc=16 nused=8 pos=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: found uid=4109982288 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: found uid=0 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: found uid=8 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: find files: found uid=10 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: found uid=5 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: found uid=4 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: found uid=3 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: found uid=1 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: empty uid=0 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: find files: empty uid=1472947520 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find files: empty uid=0 isarchive=0
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=1 mailbox=user.cassandane /altsubject= awane awane awane a...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=10 mailbox=user.cassandane /altsubject= alae alae alae alae...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=2 mailbox=user.cassandane /altsubject= baccate baccate bac...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=3 mailbox=user.cassandane /altsubject= albina albina albin...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=4 mailbox=user.cassandane /altsubject= aln aln aln aln aln...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=5 mailbox=user.cassandane /altsubject= berger berger berge...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=6 mailbox=user.cassandane /altsubject= belen belen belen b...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=7 mailbox=user.cassandane /altsubject= bandhor bandhor ban...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=8 mailbox=user.cassandane /altsubject= basnat basnat basna...
1526111/reconstruct[12465]: addannot_uid: adding found annot: uid=9 mailbox=user.cassandane /altsubject= begild begild begil...
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: nalloc=16 nused=10 pos=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=4109982128 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=0 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=2 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=4 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=1472949200 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=5 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=6 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=7 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=8 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: found uid=9 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: empty uid=1634040167 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: empty uid=4294967295 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: empty uid=808139056 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: empty uid=4294967295 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: empty uid=1952543298 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: find annots: empty uid=1393189238 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: discover files with uid <= 0
1526111/reconstruct[12465]: user.cassandane uid 4109982288 not found
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete annotations with uid < 4109982288
1526111/reconstruct[12465]: user.cassandane uid 0 not found
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete annotations with uid < 0
1526111/reconstruct[12465]: user.cassandane uid 8 found - adding
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete annotations with uid < 8
1526111/reconstruct[12465]: user.cassandane uid 10 found - adding
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete annotations with uid < 10
1526111/reconstruct[12465]: user.cassandane uid 5 rediscovered - appending
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete annotations with uid < 5
1526111/reconstruct[12465]: user.cassandane uid 4 rediscovered - appending
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete annotations with uid < 4
1526111/reconstruct[12465]: user.cassandane uid 3 rediscovered - appending
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete annotations with uid < 3
1526111/reconstruct[12465]: user.cassandane uid 1 rediscovered - appending
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete annotations with uid < 1
1526111/reconstruct[12465]: mailbox_reconstruct: want to delete remaining annotations
1526111/reconstruct[12465]: mailbox_reconstruct: discovered files: nalloc=0 nused=0 pos=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: nalloc=16 nused=10 pos=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=4109982128 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=0 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=2 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=4 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=1472949200 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=5 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=6 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=7 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=8 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: found uid=9 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: empty uid=1701732708 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: empty uid=4294967295 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: empty uid=1633905010 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: empty uid=4294967295 isarchive=0
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: empty uid=808126802 isarchive=1
1526111/reconstruct[12465]: mailbox_reconstruct: delannots: empty uid=825570608 isarchive=0
1526111/reconstruct[12465]: removing stale annotations for 4109982128
1526111/reconstruct[12465]: removing stale annotations for 0
elliefm commented 7 years ago

Another quick patch with more diagnostics:

diff --git a/imap/mailbox.c b/imap/mailbox.c
index df53ea8..b1d9e04 100644
--- a/imap/mailbox.c
+++ b/imap/mailbox.c
@@ -5306,6 +5306,8 @@ struct found_uids {
 #define FOUND_UIDS_INITIALIZER \
     { NULL, 0, 0, 0 }

+static void dump_found_uids(const struct found_uids *found, const char *d1, const char *d2);
+
 static int sort_found(const void *a, const void *b)
 {
     struct found_uid *fa = (struct found_uid *)a;
@@ -5396,6 +5398,8 @@ static int find_files(struct mailbox *mailbox, struct found_uids *files,
             if (!strncmp(p, "cyrus.", 6)) continue; /* cyrus.* files */

             r = parse_datafilename(p, &uid);
+            syslog(LOG_DEBUG, "%s: parsedatafilename(%s) => %u (r = %i)",
+                              __func__, p, uid, r);

             if (r) {
                 /* check if it's a directory */
@@ -5424,6 +5428,7 @@ static int find_files(struct mailbox *mailbox, struct found_uids *files,
         closedir(dirp);
     }

+    dump_found_uids(files, __func__, "before qsort");
     /* make sure UIDs are sorted for comparison */
     qsort(files->found, files->nused, sizeof(unsigned long), sort_found);

@@ -6207,6 +6212,7 @@ static int find_annots(struct mailbox *mailbox, struct found_uids *annots)
                              addannot_uid, annots);
     if (r) return r;

+    dump_found_uids(annots, __func__, "before qsort");
     /* make sure UIDs are sorted for comparison */
     qsort(annots->found, annots->nused, sizeof(unsigned long), sort_found);
elliefm commented 7 years ago

Output from tibbs: https://paste.fedoraproject.org/paste/BGKE~Qh972z8~VJ7UUGwdA

elliefm commented 7 years ago

I think this is the fix:

diff --git a/imap/mailbox.c b/imap/mailbox.c
index 2fffc2b..7aa0c7b 100644
--- a/imap/mailbox.c
+++ b/imap/mailbox.c
@@ -5431,7 +5431,7 @@ static int find_files(struct mailbox *mailbox, struct found_uids *files,

     dump_found_uids(files, __func__, "before qsort");
     /* make sure UIDs are sorted for comparison */
-    qsort(files->found, files->nused, sizeof(unsigned long), sort_found);
+    qsort(files->found, files->nused, sizeof(files->found[0]), sort_found);

     strarray_fini(&paths);

@@ -6215,7 +6215,7 @@ static int find_annots(struct mailbox *mailbox, struct found_uids *annots)

     dump_found_uids(annots, __func__, "before qsort");
     /* make sure UIDs are sorted for comparison */
-    qsort(annots->found, annots->nused, sizeof(unsigned long), sort_found);
+    qsort(annots->found, annots->nused, sizeof(annots->found[0]), sort_found);

     return 0;
 }
elliefm commented 7 years ago

Fixed on 3.0 and master

elliefm commented 7 years ago

I think this is now fixed, yeah?