cyrusimap / cyrus-imapd

Cyrus IMAP is an email, contacts and calendar server
http://cyrusimap.org
Other
550 stars 151 forks source link

backend send incomplete activate to mupdate master for old mailbox name on renaming a special-use folder #2994

Open MichaelMenge opened 4 years ago

MichaelMenge commented 4 years ago

2599 turned out to be two bug (or one bug and one feature request)

this is a split of.

How to Reproduce the bug: In a cyrus murder configuration rename a special-use folder. Testet with cyrus-imapd 3.0.13-14-gf712d1, cyrus-imapd master (aka 3.3.0-dev0-119-g38b20fc)

Diagnosis:

relevant content of Mupdate-Master mailboxes.db before rename

# ctl_mboxlist -C /etc/imapd_mu.conf -d | grep SpecialUse
user.LoginID.SpecialUse 1 testma1.mail.localhost!ssd LoginID    lrswipkxtecda   cyrus   lrswipkxtecda

relevant content of backend mailboxes.db before rename

# ctl_mboxlist -C /etc/imapd_be.conf -d | grep SpecialUse
user.LoginID.SpecialUse 0 ssd LoginID   lrswipkxtecda   cyrus   lrswipkxtecda

IMAP rename

# imtest -a LoginID -s testma1
a LIST (SPECIAL-USE) "" "*"
* LIST (\HasNoChildren \Sent) "/" SpecialUse
a OK Completed (0.000 secs 66 calls)
b RENAME SpecialUse SpecialUseRenamed
* OK rename SpecialUse SpecialUseRenamed
b OK Completed
c LIST (SPECIAL-USE) "" "*"
* LIST (\HasNoChildren \Sent) "/" SpecialUseRenamed
c OK Completed (0.010 secs 66 calls)
d logout
* BYE LOGOUT received
d OK Completed
Connection closed.

relevant content of mupdate master mailboxes.db after rename

# ctl_mboxlist -C /etc/imapd_mu.conf -d | grep SpecialUse
user.LoginID.SpecialUse 1 testma1.mail.localhost!
user.LoginID.SpecialUseRenamed  1 testma1.mail.localhost!ssd LoginID    lrswipkxtecda   cyrus   lrswipkxtecda

relevant content of backend mailboxes.db after rename

# ctl_mboxlist -C /etc/imapd_be.conf -d | grep SpecialUse
user.LoginID.SpecialUse 16 (null)
user.LoginID.SpecialUseRenamed  0 ssd LoginID   lrswipkxtecda   cyrus   lrswipkxtecda

mupdate master telemetry logs

# cat mupdate-1582036804.*
---------- cyrus-backend Tue Feb 18 15:40:04 2020

<1582036804<X0 DELETE {23+}
user.LoginID.SpecialUse
>1582036804>X0 OK "done"
<1582036804<X1 RESERVE {30+}
user.LoginID.SpecialUseRenamed {26+}
testma1.mail.localhost!ssd
>1582036804>X1 OK "done"
<1582036804<X2 ACTIVATE {30+}
user.LoginID.SpecialUseRenamed {26+}
testma1.mail.localhost!ssd {42+}
LoginID lrswipkxtecda   cyrus   lrswipkxtecda
>1582036804>X2 OK "done"
<1582036804<Q01 LOGOUT
>1582036804>Q01 OK "bye-bye"
---------- cyrus-backend Tue Feb 18 15:40:04 2020

<1582036804<X0 ACTIVATE {23+}
user.LoginID.SpecialUse {23+}
testma1.mail.localhost! {0+}

>1582036804>X0 OK "done"
<1582036804<Q01 LOGOUT
>1582036804>Q01 OK "bye-bye"

I could trace the incomplete ACTIVATE back to imap/mboxlist.c @@ mboxlist_foldermodseq_dirty() where mboxlist_update(mbentry, 0); is called the second parameter is "localonly" and is used to distinguish cases where mupdate mailboxes.db needs to be updated.

mboxlist_foldermodseq_dirty is only called from two places.

  1. imap/annotate.c @@ annotation_set_mailboxopt()
  2. imap/mailbox.c @@ mailbox_annot_changed()

both handel changes of annotations. As mupdate master does not store information about annotation I suspect that the mboxlist_update in mboxlist_foldermodseq_dirty() can be safely changed to mboxlist_update(mbentry, 1); My test show that this would solve the problem.

In the review of the initial pull requests #2977 and #2978 @brong had concerns about the localonly change

So that's definitely a bug worth fixing on its own - but probably not by just doing 'localonly' because that will get it out of sync.

elliefm commented 4 years ago

wip cassandane test here, but haven't successfully reproduced the problem yet: https://github.com/cyrusimap/cassandane/compare/master...elliefm:2994-murder-rename-specialuse

MichaelMenge commented 4 years ago

@elliefm any progress?

elliefm commented 4 years ago

I'm just looking at this again now, and I've suddenly noticed that your examples are using the Cyrus default of altnamespace=yes, whereas my unfinished test is using the Cassandane default of altnamespace=no.

I wonder if that will prove to be an important detail -- it may explain why I wasn't able to make it misbehave last time I looked into this.

Here's what I see at the moment, without altnamespace:

---------- mupduser Mon May 18 16:43:56 2020

<1589784236<X0 DELETE {19+}
user.cassandane.foo
>1589784236>X0 OK "done"
<1589784236<X1 RESERVE {19+}
user.cassandane.bar {22+}
localhost:9101!default
>1589784236>X1 OK "done"
<1589784236<X2 ACTIVATE {19+}
user.cassandane.bar {22+}
localhost:9101!default {56+}
cassandane      lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p       
>1589784236>X2 OK "done"
<1589784236<Q01 LOGOUT
>1589784236>Q01 OK "bye-bye"

I notice your bad ACTIVATE turns up in a second telemetry file, with the same timestamp as the RENAME. Interestingly, I do not have a second file with the same timestamp as the RENAME?? So I can't check whether the other ACTIVATE is bad because it didn't exist?

So, I still think I'm missing some detail in my reproduction, which is causing it to work cleanly instead of reproducing the problem.

The next thing I'll try is making a duplicate of the existing unfinished test, but which uses altnamespace, and see if that's the trick to making this break successfully.

elliefm commented 4 years ago

I've duplicated the test with altnamespace enabled (same link as previously), and the end result is the same:

Additionally, when I look at the mupdate telemetry Iogs as before, there is only one mupdate log for the rename, and the ACTIVATE looks complete.

Any idea what I'm missing here?

Do I need to dump the contents of each server's mailboxes.db as well? I was under the impression the problem could be observed over IMAP after the rename, am I mistaken?

MichaelMenge commented 4 years ago

@elliefm

Quoting my mail to you an Bron from Mar. 18 2020 Re: open issue and pull requests for cyrus 3.0 and 3.2

  1. the Folder with the incomplete mailboxes.db is not show in the imap LIST "" "OldFoldername" on the frontend, but the entry is in the mailboxes.db on the frontend. So on the forntend you will only see the problem if you try to create a folder the oldfoldername or rename the folder back.

Though i have not tested this on the mupdate master your observation is consistent with my observation for the frontend IMAP session. I suspect the mailbox is not shown in the IMAP or MUPDATE protocol because the mailbox entry is broken. But creating a new entry will fail, because the key aka mailboxname already exists

So to make the Problem visible you have the following Options

elliefm commented 4 years ago

Ahhhh okay, I think I understand now!

So the problem isn't that the rename doesn't work successfully; but rather, that the rename appears to succeed, but the old name isn't released properly. And so you start seeing problems only when attempting to re-use the old name for something.

I'll start by dumping the mailboxes.db on both stores and visually inspecting, and see where we go from there...

elliefm commented 4 years ago

Okay!

Before rename (trimmed for clarity),

frontend:

C: 6 list "" "*" RETURN (SUBSCRIBED SPECIAL-USE)
S: * LIST (\HasChildren) "." INBOX
S: * LIST (\Subscribed \HasNoChildren \Drafts) "." INBOX.foo
S: 6 OK Completed (0.000 secs 2 calls)

user.cassandane 1 localhost:9101!default cassandane lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   
user.cassandane.foo 1 localhost:9101!default cassandane lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   

backend:

C: 3 list "" "*" RETURN (SUBSCRIBED SPECIAL-USE)
S: * LIST (\HasChildren) "." INBOX
S: * LIST (\Subscribed \HasNoChildren \Drafts) "." INBOX.foo
S: 3 OK Completed (0.000 secs 2 calls)

user.cassandane 0 default cassandane    lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   
user.cassandane.foo 0 default cassandane    lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   

Rename:

C: 7 rename INBOX.foo INBOX.bar
S: * OK rename INBOX.foo INBOX.bar
S: 7 OK Completed

After rename,

frontend:

C: 8 list "" "*" RETURN (SUBSCRIBED SPECIAL-USE)
S: * LIST (\HasChildren) "." INBOX
S: * LIST (\HasNoChildren \Drafts) "." INBOX.bar
S: 8 OK Completed (0.000 secs 2 calls)

user.cassandane 1 localhost:9101!default cassandane lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   
user.cassandane.bar 1 localhost:9101!default cassandane lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   

backend:

C: 4 list "" "*" RETURN (SUBSCRIBED SPECIAL-USE)
S: * LIST (\HasChildren) "." INBOX
S: * LIST (\HasNoChildren \Drafts) "." INBOX.bar
S: 4 OK Completed (0.000 secs 2 calls)

user.cassandane 0 default cassandane    lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   
user.cassandane.bar 0 default cassandane    lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   
user.cassandane.foo 16 (null) 

So the big difference I'm seeing here is that the backend has a tombstone record (MBTYPE_DELETED == 16) for the old name, but the frontend does not.

I had another look at the output in the OP text, and your backend also has the tombstone record, whereas your frontend has a junk record. My front end doesn't have a junk record, but it also doesn't have a tombstone.

This test run was with the cyrus master branch. I'll try again in a moment with a 3.0 build and see if it's different...

I'm not sure whether it would make sense for the frontend mailboxes.db to hold a tombstone record (it never had the data, so it would never need to detect and replicate a deletion), but it definitely shouldn't have a junk entry! So I wonder if the bug is in the tombstone handling, and I wonder if it's already fixed on master?

elliefm commented 4 years ago

Same test but with 3.0, and post-rename frontend mailboxes.db contains:

user.cassandane 1 localhost:9101!default cassandane lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   
user.cassandane.bar 1 localhost:9101!default cassandane lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   
user.cassandane.foo 1 localhost:9101! 

So! I've finally reproduced this! Gonna do the same test against a 3.2 build and see what happens...

user.cassandane 1 localhost:9101!default cassandane lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   
user.cassandane.bar 1 localhost:9101!default cassandane lrswipkxtecdan  admin   lrswipkxtecdan  anyone  p   

So 3.2 and master seem to be sane so far by visual inspection, but 3.0 is broken. Next, I'll try creating a new folder with the same name as the original one, and see whether it succeeds, or if it fails, what the failure looks like...

elliefm commented 4 years ago

When I have the test create a new mailbox with the original name, 3.0 fails:

C: 9 create INBOX.foo
S: 9 NO unable to reserve mailbox on mupdate server

but 3.2 succeeds and both mailboxes.db looks correct.

elliefm commented 4 years ago

Oh, these latest tests are also with altnamespace=no, so I suppose I should also check with altnamespace=yes, just in case

elliefm commented 4 years ago

The cassandane tests are now automated (no visual-inspection required), but still on my branch only for now.

Do we know whether this particular problem ("rename leaves the original folder in an unreusable state") is exclusive to folders with a special-use flag, or does it happen for any folder? I know we started investigating this as being exclusive to special-use, but then we also discovered we were investigating two things at once and split it up... so I wonder if this particular problem is more general and we're not noticing cause we're only looking at special-use.

I think the next thing I'm gonna do here is add the same tests again but without the special-use flag.

MichaelMenge commented 4 years ago

As far as I can tell normal folder are not affected by this bug and I would have received many reports from our users if the where.

We don't use any other annotations by default, and very few users if any add other annotations, so this case is probably not tested by our users

elliefm commented 4 years ago

Just did a quick copy-paste-remove-specialuse from those two tests, and the new tests both pass on 3.0 and everything since.

So I think we've definitely confirmed that:

I wonder if the 3.0 bug is special-use specific, or if it's "any folder with an annotation"? I don't really know much about annotations in the general case, but I guess that's the next thing I'll need to test.

elliefm commented 4 years ago

Testing with Cyrus master, for now:

front end:

C: 5 setmetadata INBOX.foo ("/private/comment" "tasteful")
S: 5 OK Completed
C: 7 getmetadata "*" ("/private/comment")
S: 7 OK Completed

So it lets me set it, but can't show me it, and there are no untagged metadata responses at all.

back end:

C: 4 getmetadata "*" ("/private/comment")
S: * METADATA INBOX ("/private/comment" NIL)
S: * METADATA INBOX.foo ("/private/comment" NIL)
S: 4 OK Completed

So when I ask the backend, I do get untagged metadata responses for the matching folders, but both of them report that particular key as having no value.

The whole test succeeds, but I'm not yet testing the metadata values, only logging them for visual inspection...


Doing the same with 3.0:

front end:

C: 5 setmetadata INBOX.foo ("/private/comment" "tasteful")
S: 5 OK Completed
C: 7 getmetadata "*" ("/private/comment")
S: * ANNOTATION INBOX "/comment" ("value.priv" NIL)
S: * ANNOTATION INBOX.foo "/comment" ("value.priv" "tasteful")
S: 7 OK Completed

Curiously, in 3.0 the front end does find the annotation, but the untagged responses say ANNOTATION rather than METADATA(?!), and also trim the leading "/private" off the key.

back end:

C: 4 getmetadata "*" ("/private/comment")
S: * METADATA INBOX ("/private/comment" NIL)
S: * METADATA INBOX.foo ("/private/comment" "tasteful")
S: 4 OK Completed

So asking the backend directly a) finds the correct value for the key, and b) returns the values as METADATA (including the "/private" prefix) rather than ANNOTATION... the plot thickens.

After the RENAME:

front end:

C: 10 getmetadata "*" ("/private/comment")
S: * ANNOTATION INBOX "/comment" ("value.priv" NIL)
S: * ANNOTATION INBOX.bar "/comment" ("value.priv" "tasteful")
S: 10 OK Completed

Same as before wrt the response formatting, but my "tasteful" comment has correctly moved from INBOX.foo to INBOX.bar, and the nonexistent INBOX.foo is not in the results.

back end:

C: 6 getmetadata "*" ("/private/comment")
S: * METADATA INBOX ("/private/comment" NIL)
S: * METADATA INBOX.bar ("/private/comment" "tasteful")
S: 6 OK Completed

Also same as before wrt formatting. The comment has followed the rename, and the old name is also gone. Good, let's keep going...

front end:

C: 11 create INBOX.foo
S: 11 NO unable to reserve mailbox on mupdate server

Uh oh!

So it looks like, quite aside from the weirdness of the getmetadata responses, the "cannot re-use a name that previously existed" problem is not just special-use folders, but also affects regular folders that have annotations. Interesting!

I still don't really know what changed in 3.2 and master that fixed the special-use behaviour. I was hoping to narrow it down, as either being metadata-generally, or special-use-specifically.

But I feel like I have instead found some new bug on master (and probably 3.2, I haven't tested) that affects metadata, but not special-use. I wonder if the metadata bug was introduced in the fix for the special-use one, or if they're completely unrelated. Uggh.

MichaelMenge commented 3 years ago

any news on this issue?