Open elliefm opened 4 years ago
Need to consider whether mailboxes like "#calendars" and "#addressbooks" are suitable split points, or should always be treated as part of their parent; but also how to handle them if we want one-backup-per-top-level-shared but some of the top-level-shareds are like "#calendars", "#addressbooks", etc!
WIP is here, just the trivial (0 or 1) implementation so far: https://github.com/elliefm/cyrus-imapd/tree/v31/2915-shared-mailbox-backup-granularity
I'm working against master, but it should backport trivially onto 3.0 since the backups code is the same(?) on both.
Tripped over #2920 while testing this, so that's fixed now
Added a commit that takes a first stab at supporting values >1, but don't have even a basic "this should work" test for it yet. I feel like there are also some pathological edge cases I need to poke it with, but I don't quite see the shape of them yet.
I (finally) spent some time with this patch, applied on 3.2.5, and can make the following observations with `backup_shared_mailbox_granularity = 1`:
For `XBACKUP single_mailbox backup_destination`, the backup filename is constructed from the first component of the mailbox as expected. Example:
C xbackup accounts@polyfoam.com.au rsync
OK MAILBOX accounts@polyfoam.com.au
C OK Completed
[...]
-rw------- 1 cyrus mail 141846087 Dec 28 19:10 %SHARED.accounts_OeglL2
-rw------- 1 cyrus mail 344064 Dec 28 19:10 %SHARED.accounts_OeglL2.index
[...]
Note that the destination file is made from `accounts` and not `accounts@polyfoam.com.au`, which is not a problem for me, but might be an issue if a server has multiple domains with the same before-the-@ part. For consistency with plain user backup files, which are of the form `debbiep@polyfoam.com.au_XXXXXX`, maybe the domain part should be retained.
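To make the concern concrete, here is a minimal Python sketch of the naming, not the actual Cyrus code. The function name `shared_backup_prefix`, the `keep_domain` switch, and the second domain `example.com` are all hypothetical; the random `_XXXXXX` suffix is left out because it does not affect which backup a mailbox maps to.

```python
def shared_backup_prefix(mailbox: str, keep_domain: bool) -> str:
    """Hypothetical model of the shared backup name at granularity 1.

    `mailbox` is in the external form used in the XBACKUP examples,
    e.g. "accounts@polyfoam.com.au".
    """
    local, _, domain = mailbox.partition("@")
    top = local.split("/")[0]  # first component of the mailbox name
    if keep_domain and domain:
        return f"%SHARED.{top}@{domain}"
    return f"%SHARED.{top}"


# Behaviour observed above: the domain is dropped.
print(shared_backup_prefix("accounts@polyfoam.com.au", keep_domain=False))

# With a second (made-up) domain, both mailboxes map to the same name...
a = shared_backup_prefix("accounts@polyfoam.com.au", keep_domain=False)
b = shared_backup_prefix("accounts@example.com", keep_domain=False)
print(a == b)

# ...whereas retaining the domain keeps them apart.
c = shared_backup_prefix("accounts@polyfoam.com.au", keep_domain=True)
d = shared_backup_prefix("accounts@example.com", keep_domain=True)
print(c != d)
```

Whether the real code should keep the domain in exactly this form is an open question; the sketch only shows why dropping it could merge unrelated mailboxes.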
For `XBACKUP * backup_destination`, after all the user accounts are backed up, the first shared mailbox name is used for the filename (as above), and then that same filename is used for all subsequent mailboxes backed up in the same session. Example:
C XBACKUP * rsync
OK USER aaa@polyfoam.com.au
[....]
OK USER zzz@polyfoam.com.au
OK MAILBOX #addressbooks
OK MAILBOX #addressbooks/Bridgewater
OK MAILBOX #addressbooks/Dandenong
OK MAILBOX #addressbooks/Darra
OK MAILBOX #addressbooks/Moorebank
OK MAILBOX #calendars
OK MAILBOX #calendars/Leave
OK MAILBOX #calendars/Payroll
OK MAILBOX Archive@polyfoam.com.au
[...]
-rw------- 1 cyrus mail 4606397314 Dec 28 18:00 %SHARED.#addressbooks_H8LWdC
-rw------- 1 cyrus mail 3911680 Dec 28 18:00 %SHARED.#addressbooks_H8LWdC.index
%SHARED.#addressbooks /home/mail-3175-1/cyrus-backup/partitions/default/q/%SHARED.#addressbooks_H8LWdC
[Others like #calendars and Archive@polyfoam.com.au are not listed]

| uniqueid | last append date | mboxname |
| --- | --- | --- |
| dmm8u7i5zbew82q2n6u30dwn | 1970-01-01 10:00:00 | #addressbooks |
| 6ldw5m6tc6o0mx36zns4r06k | 2019-10-28 15:48:13 | #addressbooks.Bridgewater |
| 3ygf1zzj6mhnn1qwv03apl3r | 2019-12-13 10:21:18 | #addressbooks.Dandenong |
| 5wlyn70g35iqevw3by0u4zhc | 1970-01-01 10:00:00 | #addressbooks.Darra |
| 9uoazfngs3324mp4smt2e5dk | 1970-01-01 10:00:00 | #addressbooks.Moorebank |
| 59bsdtxxoeelfc0uis76of76 | 1970-01-01 10:00:00 | #calendars |
| ooyg21vzeydj9zmk3zoccx28 | 2020-12-24 12:06:37 | #calendars.Leave |
| 39ogsc9i019dpkkhjc6m8xuz | 2020-12-24 12:06:38 | #calendars.Payroll |
| kr4s1b1asqsp1b4hiyhgo7q9 | 1970-01-01 10:00:00 | polyfoam.com.au!Archive |

I'm guessing that the mailbox name is created the first time there is a transfer and not freed/recreated when the next mailbox is transferred, even if it starts with a different string.
Edit: I'm going to try to amend the code to:
`XBACKUP *`.
I just found and fixed(?) a memory leak while I was in there thinking about this. The branch doesn't build for me at the moment... I'll probably need to rebase it onto current master. Will that get in your way?
I guess it hadn't really occurred to me that shared mailboxes could have domains. I haven't seen them used much before!
> I'm guessing that the mailbox name is created the first time there is a transfer and not freed/recreated when the next mailbox is transferred, even if it starts with a different string.
Well, `cmd_apply_reserve` in backup/backupd.c assumes there's only one shared mailbox backup, so even though it calculates the first shared mailbox backup correctly, it only does it for the first one. This won't be the cause of the XBACKUP problem though, because XBACKUP sends one mailbox at a time... But `backupd_open_backup` (same file) thinks every backup has a userid and tries to use that as a key to track which backups it has open, which means it ends up tracking the first shared backup opened as if it were the only shared backup (because they all have a NULL userid)... So there's the bug, but I guess it's not so much a "bug" as it is "I didn't finish implementing it", doh!
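As a rough illustration of that failure mode, here is a small Python model of the tracking logic. It is not a translation of backup/backupd.c: the class `OpenBackups`, its `key_by_name` flag, and keying by backup name as the alternative are all assumptions made for the sketch, not necessarily how the fix will look.

```python
from typing import Dict, Optional


class OpenBackups:
    """Toy model of a table of currently open backups (hypothetical)."""

    def __init__(self, key_by_name: bool) -> None:
        self.key_by_name = key_by_name
        self.open: Dict[Optional[str], str] = {}

    def open_backup(self, userid: Optional[str], backup_name: str) -> str:
        """Return the backup that would actually be written to."""
        # Shared backups have no userid, so keying on userid gives them
        # all the same key (None) and the first one opened wins.
        key = backup_name if self.key_by_name else userid
        return self.open.setdefault(key, backup_name)


# Keyed by userid: the second shared backup resolves to the first one.
by_userid = OpenBackups(key_by_name=False)
print(by_userid.open_backup(None, "%SHARED.#addressbooks"))
print(by_userid.open_backup(None, "%SHARED.#calendars"))

# Keyed by backup name (assumed alternative): each gets its own entry.
by_name = OpenBackups(key_by_name=True)
print(by_name.open_backup(None, "%SHARED.#addressbooks"))
print(by_name.open_backup(None, "%SHARED.#calendars"))
```

This matches the symptom reported above, where `#calendars` and `Archive` all ended up inside the `%SHARED.#addressbooks` backup during a single `XBACKUP *` session.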
If you're still enthusiastic to try patching this I won't get in your way, but if you want to handball it back to me, that's pretty reasonable!
Think I might have to handball this back to you. It's not a problem to use a different branch; the backup server I'm using has no other purpose.
Instead of backing up all shared mailboxes as a single '%SHARED' backup, bundle them up at a deeper level (e.g. each top-level shared mailbox is its own "userid" with its own backup). Perhaps have a config option for the depth this should occur at, which would default to 0 for "all shared mailboxes in a single backup" like the current behaviour, 1 would be the behaviour described above, and larger numbers would produce more, finer grained backup files. You would choose a value that brings your worst case in line with normal behaviour.
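A minimal Python sketch of how the depth option could map mailboxes to backups, under some assumptions: the function name `shared_backup_name` is hypothetical, mailbox names are taken in internal form with `.` as the hierarchy separator, and domains are ignored. The `%SHARED.<prefix>` naming follows the filenames seen in this thread.

```python
def shared_backup_name(mboxname: str, granularity: int, sep: str = ".") -> str:
    """Hypothetical mapping from a shared mailbox to its backup name.

    granularity 0 -> everything in one "%SHARED" backup
    granularity N -> one backup per distinct prefix of N components
    """
    if granularity < 0:
        raise ValueError("granularity must be >= 0")
    if granularity == 0:
        return "%SHARED"
    components = mboxname.split(sep)
    # Mailboxes shallower than N simply use all of their components.
    return "%SHARED." + sep.join(components[:granularity])


for depth in (0, 1, 2):
    print(depth, shared_backup_name("#addressbooks.Bridgewater", depth))
```

With depth 1, `#addressbooks` and `#addressbooks.Bridgewater` share one backup; with depth 2 they are split apart, which is the "more, finer grained backup files" case.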
This could help with cases like Deborah's where the single %SHARED backup is many times larger than a typical user backup, which makes tuning the backup configuration clumsy because it can't be tuned well for both cases.
I think this should be safe to enable on an existing system -- you'd end up with a bunch of new %SHARED.foo backups in addition to the existing %SHARED backup, and could then delete the old %SHARED backup once you're sure you don't need it anymore.