mailcow / mailcow-dockerized

mailcow: dockerized - 🐮 + 🐋 = 💕
https://mailcow.email
GNU General Public License v3.0

Use sdbox/mdbox #1007

Closed Lennix closed 5 years ago

Lennix commented 6 years ago

I was wondering if it would be possible to use sdbox or mdbox instead of maildir? I've already fiddled around with the config but it didn't seem to work.

The idea is that I'd like to use mail_attachment_dir to put larger attachments onto an external storage, but that only works for dbox. Also there's a nice performance boost when using dbox (on larger installations).

extremeshok commented 6 years ago

removed

extremeshok commented 6 years ago

removed

tgmedia-nz commented 6 years ago

I second this - I have used mdbox on large scale systems, maildir would have killed it with IO. Also, there is no coding required for moving mail to another storage layer, dovecot has a built in "alternative storage" with dbox/mdbox: https://wiki2.dovecot.org/MailboxFormat/dbox.

In all my mailcow deployments I have switched from maildir to mdbox. For compression I use ZFS :)

@Lennix - you'll have to update the config in

mailcow-dockerized/data/Dockerfiles/dovecot/docker-entrypoint.sh:

SELECT CONCAT('mdbox:/var/vmail/',maildir) ....

(change the first maildir to mdbox)

and add

mail_location = mdbox:~/

to dovecot.conf

and rebuild your image (https://mailcow.github.io/mailcow-dockerized-docs/u_e-docker-cust_dockerfiles/)

Another way is to copy your docker-entrypoint.sh to the conf dir and update it, then link it into the image via docker-compose.yml:

      volumes:
        - ./data/conf/dovecot/docker-entrypoint.sh:/docker-entrypoint.sh:ro

I know it's hacky but works for me.

stevesbrain commented 6 years ago

@tgmedia-nz Out of curiosity, what sort of size were you at to notice the difference in IO?

extremeshok commented 6 years ago

OK, I've been doing a lot of research and testing regarding DBOX and have completely changed my opinion of it.

The biggest benefit is with the following: MDBOX + SIS + LZ4

DBOX removes the need to scan an entire directory when loading an inbox. There is also vastly reduced IO, as the dbox index files contain the message flags: if a user marks 50 emails as viewed, only 1 file is updated (with maildir, each message file would be renamed).
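
A minimal sketch of the maildir side of this comparison, assuming the standard maildir naming convention where flags are appended to the filename after ":2,". The function name and the sample filenames are hypothetical; the point is only that every flag change produces a new filename, i.e. one rename per message.

```python
def maildir_name_with_flag(filename: str, flag: str) -> str:
    """Return the maildir filename after adding one flag letter (e.g. 'S' for seen)."""
    base, _sep, flags = filename.partition(":2,")
    # Maildir flags are kept in ASCII order; a set drops duplicates.
    new_flags = "".join(sorted(set(flags) | {flag}))
    return f"{base}:2,{new_flags}"


# 50 unread messages (hypothetical filenames with an empty flag list)
names = [f"1520000000.M{i}P1.host:2," for i in range(50)]
renamed = [maildir_name_with_flag(name, "S") for name in names]

# Every message whose name changed needs its own rename() on disk.
renames_needed = sum(old != new for old, new in zip(names, renamed))
print(renames_needed)
```

With dbox the same operation touches the index rather than the individual message files, which is the IO difference described above.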

SIS (Single Instance Storage) allows for full attachment de-duplication. One of the domains I tested showed a 10:1 space saving, i.e. 10 GB of messages became 1 GB. Using SHA256 as the hash avoids the collision issues observed with SHA1. SHA384/SHA512 can be used if one is still worried about hash collisions.
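
The idea behind attachment de-duplication can be sketched as a content-addressed store: each attachment is saved under the hash of its bytes, so identical attachments end up as one file. This is only an illustration of the principle, not how Dovecot lays out its attachment directory; `store_attachment` and the temporary directory are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def store_attachment(store: Path, data: bytes) -> Path:
    """Save data under its SHA-256 digest and return the path; identical data is stored once."""
    digest = hashlib.sha256(data).hexdigest()
    path = store / digest
    if not path.exists():
        path.write_bytes(data)
    return path


store = Path(tempfile.mkdtemp())
attachment = b"x" * 1024  # stand-in for a file mailed to ten recipients

refs = [store_attachment(store, attachment) for _ in range(10)]
stored_files = list(store.iterdir())
print(len(refs), "references,", len(stored_files), "file on disk")
```

Ten deliveries of the same attachment yield ten references to a single stored file, which is where a saving like the 10:1 figure above can come from.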

LZ4 gives a 2:1 text compression on the message bodies.

DBOX also enables alternative/slow storage for older emails (ALT=; make sure this directory is owned by root:root with mode 0755, to prevent killing the index files if the alternative storage is not mounted): https://wiki.dovecot.org/MailLocation/dbox

"doveadm altmove" will move emails older than X to the alternative storage, so a simple weekly/monthly cron job is enough to enable slow storage.

Notes: SIS requires LMTP via the postfix virtual transport.

With regard to SDBOX or MDBOX: MDBOX stores many messages in a single file, which allows for faster backups and reduced IO; the map/index file points to the location of each email. If there is corruption, many emails can be lost. There are more issues and complications when running via NFS. Also, since the MDBOX files change continuously, they hammer backup systems: the entire file has to be copied on every backup, not just the changes.

SDBOX: each message is stored as a separate file, with the attachments stripped and de-duplicated when running with SIS. The index contains the headers, flags, etc. of each email. If there is corruption, only the single corrupt email is lost.


It's possible to use both mboxes and maildirs for the same user by configuring multiple namespaces. https://wiki.dovecot.org/Namespaces#Mixed_mbox_and_Maildir


As per Zarafa regarding SIS: https://doc.zarafa.com/trunk/Administrator_Manual/en-US/html/_single_instance_attachment_storage.html

6.5. Single Instance Attachment Storage
Since ZCP 6.30 the Zarafa Server provides Single Instance Attachment Storage to avoid redundant storage of attachments. This feature, as its name implies, only keeps one copy of each attachment when a message is sent to multiple recipients within the same server. This mechanism, thus, minimizes the disk space requirements and remarkably enhances delivery efficiency when messages with attachments sent to large distribution lists.
Let’s assume the following situation: user A belongs to a Zarafa server; he sends a message with 10 MB of attachments to 30 users that reside on the same server. In a normal situation 30 copies of the files would be saved on the database, leading to an inefficient usage of the storage space (310 MB of data). With single instance attachment store, only one copy of each attachment is saved on the database (only 10 MB of data in this example) and all the 30 users can access the attachment through a reference pointer.

andryyy commented 6 years ago

There isn't a "one size fits all" solution. I am totally aware there are mdbox, sdbox etc. Depending on your setup, mdbox might be the better solution with even more benefits. There are other reasons we use maildir right now. Everyone knows it, it is almost unbreakable and well supported. mailcow can be easily modified to use whatever mailbox format you want to, @tgmedia-nz summed it up above, thanks! 👍 We can create a variable to change that easily.

I agree about compression though, we should enable lz4.

extremeshok commented 6 years ago

> We can create a variable to change that easily.

Having the installer prompt for the storage method during the initial setup, defaulting to maildir, would be ideal, i.e. maildir, sdbox, mdbox, sdbox + sis, mdbox + sis.

I can do some testing and push a patch if you are too busy?

lavdnone commented 6 years ago

@andryyy "We can create a variable to change that easily." That would be great. I would kill for anything that lowers the pressure on the disks; some of us are using NFS and GlusterFS network-distributed storage.

Besides, I can't imagine going away from dovecot, at least for mailcow project

extremeshok commented 6 years ago

One can convert from dbox format back to maildir using dsync, if we ever needed to move away from Dovecot.

andryyy commented 6 years ago

Can anybody test how Dovecot reacts to maildir with "mail_attachment_fs = sis posix"? I will add this option later today.

lavdnone commented 6 years ago

@extremeshok SDBOX vs MDBOX: which had lower disk usage? I assume neither goes through the files and both just use the index. So MDBOX will just add a locking problem (SOGo + ActiveSync + IMAP accessing it from, in my case, different servers). I haven't bothered much with SIS and LZ4, which I probably should. It would help if you could share your test environment configs to try.

I also think it disregards mail_location = mdbox:~/ in dovecot.conf, as the data is in SQL. To put the index in a different location (local SSD) I had to change the SQL query for domain creation where it has mail_location.

extremeshok commented 6 years ago

@andryyy
> Can anybody test how Dovecot reacts to maildir with "mail_attachment_fs = sis posix"? I will add this option later today.

Dovecot with Maildir ignores the following settings when they are present in the config.

BTW, below is the correct way to configure SIS; sha512 makes hash collisions extremely unlikely.

# Support for mail attachment de-duplication (aka SIS aka Single Instance Storage)
mail_attachment_dir = /var/vmail/attachments
mail_attachment_hash = %{sha512}
mail_attachment_min_size = 64k
mail_attachment_fs = sis posix

extremeshok commented 6 years ago

Conversion from maildir with lz4 to mdbox + sis

MDBOX

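# Note: the mbox_* settings below belong to the legacy mbox format and do not affect mdbox;
# only the mdbox_* settings are relevant to this conversion.
mbox_dirty_syncs = yes
mbox_dotlock_change_timeout = 2 mins
mbox_lazy_writes = yes
mbox_lock_timeout = 5 mins
mbox_md5 = apop3d
mbox_min_index_size = 0
mbox_read_locks = fcntl
mbox_very_dirty_syncs = no
mbox_write_locks = dotlock fcntl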
mdbox_preallocate_space = no
mdbox_purge_preserve_alt = no
mdbox_rotate_interval = 1d
mdbox_rotate_size = 16M

SIS

# Support for mail attachment de-duplication (aka SIS aka Single Instance Storage)
mail_attachment_dir = /var/vmail/attachments
mail_attachment_hash = %{sha512}
mail_attachment_min_size = 64k
mail_attachment_fs = sis posix

LZ4

# Enable mail compression via the zlib plugin using lz4, roughly 2:1 on text
plugin {
  zlib_save_level = 9
  zlib_save = lz4
}

extremeshok commented 6 years ago

@lavdnone We have massive lock issues with maildir.. (2x dedicated standalone sogo servers, 150+ connections a second, webmail server, multiple imap load balancers, etc)

Server is enterprise SSD RAID 1 (mirror) on ZFS, 128 GB DDR4 ECC.
maildir had a minimum I/O delay of 80%
sdbox had a minimum I/O delay of 35%
mdbox seems to hover around 0-5%

mdbox with the default 2 MB rotate size keeps the storage files at around 2 MB each. This is way quicker than thousands of files smaller than 64K.

Remember maildir requires the files to be renamed and linked/copied

tgmedia-nz commented 6 years ago

@extremeshok That is similar to the mail storage environment I was using a few years ago. I'd recommend pushing the rotate size to 8/16 MB for even better I/O results.

extremeshok commented 6 years ago

Results of maildir + lz4 to mdbox+sis+lz4

--- Converting: community ( community@ )
Before (mailbox lz4): 2.8G
After (mdbox sis): 471M
Attachments Total (sis): 1.5G
--- Converting: abca ( abc@ )
Before (mailbox lz4): 137M
After (mdbox sis): 1000K
Attachments Total (sis): 1.6G
--- Converting: auxiliary ( auxiliary@ )
Before (mailbox lz4): 1.9G
After (mdbox sis): 244M
Attachments Total (sis): 2.3G
--- Converting: residents ( residents@ )
Before (mailbox lz4): 303M
After (mdbox sis): 29M
Attachments Total (sis): 2.3G
--- Converting: accounts ( accounts@ )
Before (mailbox lz4): 1.9G
After (mdbox sis): 244M
Attachments Total (sis): 3.3G
--- Converting: admin ( admin@ )
Before (mailbox lz4): 4.5G
After (mdbox sis): 407M
Attachments Total (sis): 4.9G
--- Converting: rooms ( rooms@ )
Before (mailbox lz4): 2.6G
After (mdbox sis): 247M
Attachments Total (sis): 5.9G
--- Converting: ceo ( ceo@ )
Before (mailbox lz4): 5.9G
After (mdbox sis): 732M
Attachments Total (sis): 8.4G
--- Converting: pro ( pro@ )
Before (mailbox lz4): 2.0G
After (mdbox sis): 188M
Attachments Total (sis): 8.9G
--- Converting: housing ( housing@ )
Before (mailbox lz4): 5.2G
After (mdbox sis): 1.2G
Attachments Total (sis): 9.9G
--- Converting: village ( village@ )
Before (mailbox lz4): 474M
After (mdbox sis): 71M
Attachments Total (sis): 9.9G
--- Converting: louise ( louise@ )
Before (mailbox lz4): 258M
After (mdbox sis): 32M
Attachments Total (sis): 11G
--- Converting: cherie ( cherie@ )
Before (mailbox lz4): 706M
After (mdbox sis): 60M
Attachments Total (sis): 11G
--- Converting: lanie ( lanie@ )
Before (mailbox lz4): 131M
After (mdbox sis): 4.3M
Attachments Total (sis): 11G
--- Converting: roekeya ( roekeya@ )
Before (mailbox lz4): 14G
After (mdbox sis): 1.3G
Attachments Total (sis): 17G
--- Converting: denzel ( denzel@ )
Before (mailbox lz4): 21M
After (mdbox sis): 15M
Attachments Total (sis): 17G
--- Converting: john ( john@ )
Before (mailbox lz4): 607M
After (mdbox sis): 29M
Attachments Total (sis): 18G
--- Converting: operations2 ( operations2@ )
Before (mailbox lz4): 1.7G
After (mdbox sis): 94M
Attachments Total (sis): 19G
--- Converting: kobie ( kobie@ )
Before (mailbox lz4): 2.2G
After (mdbox sis): 140M
Attachments Total (sis): 20G
--- Converting: operations ( operations@ )
Before (mailbox lz4): 1.7G
After (mdbox sis): 184M
Attachments Total (sis): 21G
--- Converting: serieta ( serieta@ )
Before (mailbox lz4): 7.1G
After (mdbox sis): 821M
Attachments Total (sis): 24G
--- Converting: deon ( deon@ )
Before (mailbox lz4): 508M
After (mdbox sis): 43M
Attachments Total (sis): 24G
--- Converting: tamlyn ( tamlyn@ )
Before (mailbox lz4): 6.4G
After (mdbox sis): 671M
Attachments Total (sis): 27G
--- Converting: noleen ( noleen@ )
Before (mailbox lz4): 663M
After (mdbox sis): 56M
Attachments Total (sis): 27G
--- Converting: joanne ( joanne@ )
Before (mailbox lz4): 8.4G
After (mdbox sis): 882M
Attachments Total (sis): 32G
--- Converting: laura ( laura@ )
Before (mailbox lz4): 3.0G
After (mdbox sis): 82M
Attachments Total (sis): 33G
--- Converting: gerrie ( gerrie@ )
Before (mailbox lz4): 2.7G
After (mdbox sis): 297M
Attachments Total (sis): 35G
--- Converting: peter ( peter@ )
Before (mailbox lz4): 1.3G
After (mdbox sis): 94M
Attachments Total (sis): 35G
--- Converting: jason ( jason@ )
Before (mailbox lz4): 55M
After (mdbox sis): 472K
Attachments Total (sis): 35G
--- Converting: rebecca ( rebecca@ )
Before (mailbox lz4): 506M
After (mdbox sis): 109M
Attachments Total (sis): 35G
--- Converting: accounts ( accounts@ )
Before (mailbox lz4): 2.4G
After (mdbox sis): 191M
Attachments Total (sis): 37G
--- Converting: vanessa ( vanessa@ )
Before (mailbox lz4): 1.3G
After (mdbox sis): 140M
Attachments Total (sis): 38G
--- Converting: timothy ( timothy@ )
Before (mailbox lz4): 6.2G
After (mdbox sis): 787M
Attachments Total (sis): 40G
--- Converting: jaymie ( jaymie@ )
Before (mailbox lz4): 1.2M
After (mdbox sis): 128K
Attachments Total (sis): 40G
--- Converting: sean ( sean@ )
Before (mailbox lz4): 852M
After (mdbox sis): 141M
Attachments Total (sis): 41G
--- Converting: michelle ( michelle@ )
Before (mailbox lz4): 100K
After (mdbox sis): 68K
Attachments Total (sis): 41G
--- Converting: accounts ( accounts@ )
Before (mailbox lz4): 100K
After (mdbox sis): 68K
Attachments Total (sis): 41G
--- Converting: elaine ( elaine@extremecooling.org )
Before (mailbox lz4): 1.4G
After (mdbox sis): 296M
Attachments Total (sis): 41G
--- Converting: root ( root@extremecooling.org )
Before (mailbox lz4): 2.1G
After (mdbox sis): 806M
Attachments Total (sis): 42G
--- Converting: sales ( sales@apollo-auto.com )
Before (mailbox lz4): 4.3G
After (mdbox sis): 975M
Attachments Total (sis): 44G
--- Converting: bounce ( bounce@apollo-auto.com )
Before (mailbox lz4): 104K
After (mdbox sis): 72K
Attachments Total (sis): 44G
--- Converting: admin ( admin@apollo-auto.com )
Before (mailbox lz4): 239M
After (mdbox sis): 36M
Attachments Total (sis): 44G
--- Converting: spam ( spam@apollo-auto.com )
Before (mailbox lz4): 4.0K
After (mdbox sis): 8.0K
Attachments Total (sis): 44G
--- Converting: carly ( carly@apollo-auto.com )
Before (mailbox lz4): 2.8G
After (mdbox sis): 118M
Attachments Total (sis): 45G
--- Converting: rachel ( rachel@apollo-auto.com )
Before (mailbox lz4): 285M
After (mdbox sis): 24M
Attachments Total (sis): 45G
--- Converting: help ( help@apollo-auto.com )
Before (mailbox lz4): 100K
After (mdbox sis): 72K
Attachments Total (sis): 45G
--- Converting: orders1 ( orders1@ )
Before (mailbox lz4): 46M
After (mdbox sis): 5.4M
Attachments Total (sis): 45G
--- Converting: orders2 ( orders2@ )
Before (mailbox lz4): 2.6M
After (mdbox sis): 580K
Attachments Total (sis): 45G
--- Converting: accounts ( accounts@ )
Before (mailbox lz4): 1.6G
After (mdbox sis): 119M
Attachments Total (sis): 46G
--- Converting: factory ( factory@ )
Before (mailbox lz4): 108K
After (mdbox sis): 76K
Attachments Total (sis): 46G
--- Converting: ray ( ray@ )
Before (mailbox lz4): 68M
After (mdbox sis): 11M
Attachments Total (sis): 46G
--- Converting: sales1 ( sales1@ )
Before (mailbox lz4): 4.0K
After (mdbox sis): 8.0K
Attachments Total (sis): 46G
--- Converting: sales2 ( sales2@ )
Before (mailbox lz4): 100K
After (mdbox sis): 68K
Attachments Total (sis): 46G
--- Converting: lab ( lab@ )
Before (mailbox lz4): 7.8M
After (mdbox sis): 412K
Attachments Total (sis): 46G
--- Converting: userone ( userone@ )
Before (mailbox lz4): 15M
After (mdbox sis): 9.3M
Attachments Total (sis): 46G
--- Converting: usertwo ( usertwo@ )
Before (mailbox lz4): 1.7M
After (mdbox sis): 264K
Attachments Total (sis): 46G
--- Converting: alison ( alison@ )
Before (mailbox lz4): 2.9G
After (mdbox sis): 255M
Attachments Total (sis): 47G
--- Converting: choppie ( choppie@ )
Before (mailbox lz4): 321M
After (mdbox sis): 11M
Attachments Total (sis): 47G
--- Converting: samantha ( samantha@ )
Before (mailbox lz4): 294M
After (mdbox sis): 42M
Attachments Total (sis): 47G
--- Converting: chiro ( chiro@ )
Before (mailbox lz4): 820M
After (mdbox sis): 98M
Attachments Total (sis): 48G
--- Converting: dawn ( dawn@ )
Before (mailbox lz4): 300M
After (mdbox sis): 68M
Attachments Total (sis): 48G
--- Converting: arthur ( arthur@ )
Before (mailbox lz4): 37M
After (mdbox sis): 2.1M
Attachments Total (sis): 48G
--- Converting: lloyd ( lloyd@ )
Before (mailbox lz4): 52M
After (mdbox sis): 33M
Attachments Total (sis): 48G
--- Converting: appledawn ( appledawn@ )
Before (mailbox lz4): 1.1M
After (mdbox sis): 852K
Attachments Total (sis): 48G
--- Converting: insure ( insure@ )
Before (mailbox lz4): 337M
After (mdbox sis): 27M
Attachments Total (sis): 48G
--- Converting: pauline ( pauline@ )
Before (mailbox lz4): 2.4G
After (mdbox sis): 119M
Attachments Total (sis): 49G
--- Converting: scanner ( scanner@ )
Before (mailbox lz4): 84K
After (mdbox sis): 60K
Attachments Total (sis): 49G
--- Converting: admin ( admin@ )
Before (mailbox lz4): 3.6G
After (mdbox sis): 90M
Attachments Total (sis): 52G
--- Converting: harryb ( harryb@ )
Before (mailbox lz4): 1.1G
After (mdbox sis): 53M
Attachments Total (sis): 53G
--- Converting: scanner ( scanner@ )
Before (mailbox lz4): 84K
After (mdbox sis): 60K
Attachments Total (sis): 53G
--- Converting: kevin ( kevin@ )
Before (mailbox lz4): 21M
After (mdbox sis): 1.9M
Attachments Total (sis): 53G
--- Converting: warren ( warren@ )
Before (mailbox lz4): 9.5M
After (mdbox sis): 464K
Attachments Total (sis): 53G
--- Converting: accounts ( accounts@ )
Before (mailbox lz4): 12M
After (mdbox sis): 1.7M
Attachments Total (sis): 53G
--- Converting: vincent ( vincent@ )
Before (mailbox lz4): 837M
After (mdbox sis): 70M
Attachments Total (sis): 53G
--- Converting: felicity ( felicity@ )
Before (mailbox lz4): 3.4G
After (mdbox sis): 208M
Attachments Total (sis): 55G
--- Converting: sales ( sales@ )
Before (mailbox lz4): 2.1G
After (mdbox sis): 259M
Attachments Total (sis): 56G
--- Converting: philip ( philip@ )
Before (mailbox lz4): 635M
After (mdbox sis): 81M
Attachments Total (sis): 57G
--- Converting: pe ( pe@ )
Before (mailbox lz4): 3.0G
After (mdbox sis): 428M
Attachments Total (sis): 58G
--- Converting: accounts ( accounts@ )
Before (mailbox lz4): 275M
After (mdbox sis): 26M
Attachments Total (sis): 58G
--- Converting: admin ( admin@ )
Before (mailbox lz4): 852M
After (mdbox sis): 84M
Attachments Total (sis): 58G
--- Converting: fax ( fax@ )
Before (mailbox lz4): 31M
After (mdbox sis): 16M
Attachments Total (sis): 58G
--- Converting: sales2 ( sales2@ )
Before (mailbox lz4): 140M
After (mdbox sis): 13M
Attachments Total (sis): 59G
--- Converting: service ( service@ )
Before (mailbox lz4): 855M
After (mdbox sis): 169M
Attachments Total (sis): 59G
--- Converting: daycare ( daycare@ )
Before (mailbox lz4): 124K
After (mdbox sis): 80K
Attachments Total (sis): 59G
--- Converting: admin ( admin@ )
Before (mailbox lz4): 3.0G
After (mdbox sis): 219M
Attachments Total (sis): 60G
--- Converting: pledge ( pledge@ )
Before (mailbox lz4): 124K
After (mdbox sis): 80K
Attachments Total (sis): 60G
--- Converting: tracey ( tracey@ )
Before (mailbox lz4): 276K
After (mdbox sis): 172K
Attachments Total (sis): 60G
--- Converting: kim ( kim@ )
Before (mailbox lz4): 56M
After (mdbox sis): 35M
Attachments Total (sis): 60G
--- Converting: ulrught ( ulrught@ )
Before (mailbox lz4): 495M
After (mdbox sis): 53M
Attachments Total (sis): 61G
--- Converting: francine ( francine@ )
Before (mailbox lz4): 4.0K
After (mdbox sis): 8.0K
Attachments Total (sis): 61G
--- Converting: leigh ( leigh@ )
Before (mailbox lz4): 4.0K
After (mdbox sis): 8.0K
Attachments Total (sis): 61G
--- Converting: info ( info@ )
Before (mailbox lz4): 4.0K
After (mdbox sis): 8.0K
Attachments Total (sis): 61G
--- Converting: finance ( finance@ )
Before (mailbox lz4): 4.0K
After (mdbox sis): 8.0K
Attachments Total (sis): 61G
--- Converting: anri ( anri@ )
Before (mailbox lz4): 4.0K
After (mdbox sis): 8.0K
Attachments Total (sis): 61G
--- Converting: sales ( sales@ )
Before (mailbox lz4): 1.7G
After (mdbox sis): 160M
Attachments Total (sis): 62G
--- Converting: carmen ( carmen@ )
Before (mailbox lz4): 1.3G
After (mdbox sis): 131M
Attachments Total (sis): 62G
--- Converting: keith ( keith@ )
Before (mailbox lz4): 631M
After (mdbox sis): 37M
Attachments Total (sis): 62G
--- Converting: gary ( gary@ )
Before (mailbox lz4): 11G
After (mdbox sis): 840M
Attachments Total (sis): 68G
--- Converting: deon ( deon@ )
Before (mailbox lz4): 1.9G
After (mdbox sis): 189M
Attachments Total (sis): 68G
--- Converting: elke ( elke@ )
Before (mailbox lz4): 6.4G
After (mdbox sis): 809M
Attachments Total (sis): 71G
--- Converting: astin ( astin@ )
Before (mailbox lz4): 1.4G
After (mdbox sis): 24M
Attachments Total (sis): 72G
--- Converting: cellcec ( cellcec@ )
Before (mailbox lz4): 254M
After (mdbox sis): 13M
Attachments Total (sis): 72G
--- Converting: jfrimpong ( jfrimpong@ )
Before (mailbox lz4): 574M
After (mdbox sis): 57M
Attachments Total (sis): 72G
--- Converting: sales ( sales@ )
Before (mailbox lz4): 12G
After (mdbox sis): 1.5G
Attachments Total (sis): 78G
--- Converting: accounts ( accounts@ )
Before (mailbox lz4): 2.6G
After (mdbox sis): 184M
Attachments Total (sis): 80G
--- Converting: admin ( admin@ )
Before (mailbox lz4): 1.3G
After (mdbox sis): 280M
Attachments Total (sis): 80G
--- Converting: jrockson ( jrockson@ )
Before (mailbox lz4): 1.9G
After (mdbox sis): 164M
Attachments Total (sis): 80G
--- Converting: accounts2 ( accounts2@ )
Before (mailbox lz4): 931M
After (mdbox sis): 103M
Attachments Total (sis): 81G
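
To read a log like the one above more easily, the per-mailbox ratio can be computed with a small parser. This is a sketch that assumes the sizes are du-style values with K/M/G suffixes and that the lines follow exactly the layout shown; `parse_size` and `mailbox_ratios` are hypothetical helper names. Note that "After" excludes attachments, which land in the shared SIS store ("Attachments Total" appears to be a running total across all mailboxes), so these ratios overstate the overall saving.

```python
import re

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> float:
    """Convert a du-style size such as '2.8G' or '471M' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def mailbox_ratios(log: str) -> list[tuple[str, float]]:
    """Return (mailbox, before/after) pairs for every conversion entry in the log."""
    pattern = re.compile(
        r"--- Converting: (\S+) .*?\n"
        r"Before \(mailbox lz4\): (\S+)\n"
        r"After \(mdbox sis\): (\S+)"
    )
    return [
        (name, parse_size(before) / parse_size(after))
        for name, before, after in pattern.findall(log)
    ]


# Two entries copied from the log above
sample = """--- Converting: community ( community@ )
Before (mailbox lz4): 2.8G
After (mdbox sis): 471M
Attachments Total (sis): 1.5G
--- Converting: spam ( spam@apollo-auto.com )
Before (mailbox lz4): 4.0K
After (mdbox sis): 8.0K
Attachments Total (sis): 44G
"""

ratios = mailbox_ratios(sample)
for name, ratio in ratios:
    print(f"{name}: {ratio:.1f}:1")
```

Empty mailboxes such as spam@ come out below 1:1, since an empty mdbox mailbox takes 8.0K where the empty maildir took 4.0K.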

lavdnone commented 6 years ago

@extremeshok Thanks for the data. It looks like people are sending attachments to each other 90% of the time.

In my case, with mdbox the index files need to be moved back from local SSD to the shared GlusterFS storage. With Maildir the index didn't matter much, as Dovecot was listing the mail files in folders anyway. With mdbox the index will be part of the emails. I will have to test performance and locks.

Lennix commented 6 years ago

Since this got some traction I was wondering if someone could comment how difficult it would be to integrate an option for the config-scripts to enable mdbox + addons?

chriswayg commented 6 years ago

I just noticed that compression was enabled in February: "Enable maildir compression" #1090.

In spite of this my newly installed mailcow server with dovecot IMAP mailbox storage uses twice as much space as anticipated. I transferred a large Gmail mailbox with 9 GB of mail (90,000 messages) to mailcow using imapsync. I was surprised to find out, that dovecot uses 21 GB of storage for the same data.

@extremeshok shows a large decrease in mailbox size from 'maildir + lz4' to 'mdbox+sis+lz4'.

Would the dbox format be able to decrease the amount of space used by the mailbox? Could this be added as an option in generate_config.sh?

hachre commented 6 years ago

@chriswayg: This sounds to me as if during your imapsync mails got duplicated. Generally if you transfer the "All Mail" directory, it will contain duplicates of every mail you have in a different folder. I recommend using Thunderbird and the "Remove Duplicate Messages (Alternative)" Addon to get rid of the duplicates. If you cut your used space in half you're at roughly the GMail size.

stale[bot] commented 6 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

extremeshok commented 6 years ago

I would like to work on this, in order to submit a PR to have it added as a user-selectable option for new installations.

The default will remain maildir without compression.

Then add the user-selectable options:
option: enable compression for mail storage
option: enable mdbox+sis instead of maildir

lavdnone commented 6 years ago

From what I saw, compression is already the default in the current version.

I wonder if there is a way to properly sync indexes between cluster nodes: "main reasons for dbox's high performance is that it uses Dovecot's index files as the only storage for message flags and keywords". Right now I have the index on local SSD for maildir: each node indexes on its own. Those dovecot.index and map.index files will probably get over-locked or messed up if they sit on a GlusterFS volume read by multiple nodes.

extremeshok commented 6 years ago

@lavdnone The Ceph plugin is required for your use case; Gluster has terrible lag once the inbox grows beyond the million-mail mark.

@chriswayg Obviously you have limited experience dealing with corporate email. Here is an example: a company with 100 users. They email the same attachment to multiple people, or multiple people receive the same attachment. Now imagine a single email has a 50 MB DWG attachment. You can fill your email server within 1 week.

As for being at Gmail size: 1 organization contains 22 TB of de-duplicated emails. One of my personal email accounts has more than 7 million emails.

This will be an optional setting, like full text search

lavdnone commented 6 years ago

@extremeshok Thank you for the advice. I have Ceph in the works as a block device for this purpose; it never dawned on me that there is a Dovecot RADOS objects plugin. As a quick fix for Gluster I had to make a FUSE file system to bypass Gluster on local reads (https://github.com/lavdnone/unionfs-fuse); 200k inboxes load great. Funny, but it works.

andryyy commented 6 years ago

It is not enough to stupidly enable mdbox and deduplication. You need to take care of cross-format ACLs, of encrypted deduplication, shared folders, compatibility of existing shares (!) etc.

I am not willing to accept a PR for this, because I guarantee any PR would just add/change 3 or 4 parameters, no tests will be made and that's it. "Worked for me".

Even worse would be making it selectable, which would introduce total madness.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

udoL commented 5 years ago

Hi, unfortunately the approach described by @tgmedia-nz above doesn't work for me with the current version. I had to change the attributes entry in the database:

update mailbox set attributes='{"force_pw_update":"0","tls_enforce_in":"0","tls_enforce_out":"0","sogo_access":"1","mailbox_format":"mdbox:","quarantine_notification":"never"}' where username='xxxxx@yyyy.de';

I want to switch to mdbox due to backup issues with maildir: a full backup of millions of small files takes hours (days), which is no issue with mdbox.
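
The UPDATE statement above overwrites the whole attributes JSON, which would also reset any other per-mailbox settings that differ from the pasted values. A more cautious variant is to read the existing JSON, change only the mailbox_format key, and write the result back. The sketch below shows just that transformation; the helper name is hypothetical, the previous value "maildir:" is an assumption, and the database read/write is left out.

```python
import json


def set_mailbox_format(attributes_json: str, mailbox_format: str) -> str:
    """Return the attributes JSON with only the mailbox_format key changed."""
    attributes = json.loads(attributes_json)
    attributes["mailbox_format"] = mailbox_format
    # Compact separators match the style of the JSON shown in the comment above.
    return json.dumps(attributes, separators=(",", ":"))


# Assumed existing value for one mailbox (previous format "maildir:" is a guess)
current = (
    '{"force_pw_update":"0","tls_enforce_in":"1","tls_enforce_out":"0",'
    '"sogo_access":"1","mailbox_format":"maildir:","quarantine_notification":"never"}'
)

updated = set_mailbox_format(current, "mdbox:")
print(updated)
```

Here tls_enforce_in stays at "1" after the change, whereas pasting a fixed JSON string would have silently reset it.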

lavdnone commented 5 years ago

I had to do it in the DB too, and change the mailbox creation script. Is there a reason this setting is in the live DB at all? It's not as if we want to change the mailbox format on the fly from the admin panel. It is effectively static. It would be easier to have it back in dovecot.conf.

smares commented 4 years ago

I understand that mdbox and automatic deduplication can be troublesome, but why isn't mdbox at least configurable?