mjl- / mox

modern full-featured open source secure mail server for low-maintenance self-hosted email
https://www.xmox.nl
MIT License
3.56k stars, 99 forks

Feature: Use Object Storage for IMAP storage #138

Closed (daluntw closed 7 months ago)

daluntw commented 7 months ago

Is it possible to use another storage method to store the mail?

Other solutions such as Dovecot can store mail in S3 (compatible) storage, which gives more flexibility and speed for backups / disaster recovery.

mjl- commented 7 months ago

Hi @daluntw! This isn't possible at the moment. I can see the use-case, mostly around disaster recovery. (You would still have to back up the message index database periodically.) Unless the S3-like storage is also self-hosted, this idea would run counter to the mox philosophy of self-hosted email.

Before doing this, I think we should first automatically compress and encrypt all stored messages. Then at least the (remote) storage cannot read any sensitive data.

If we do this, I don't want to include big SDK dependencies for S3. But I believe the S3 API is pretty simple and can probably be implemented standalone in at most a few hundred lines.

I had a quick look at the (scarce) documentation at dovecot about S3 storage. It seems they do prefetching and caching, and delay/batch some operations. That would make the implementation more complicated, and ideally we should do without most of that. In particular, changes should be completed on the remote storage before we give the OK, otherwise we would accept deliveries and could lose the data on a crash. We should probably have a read and write-through cache though, so recently accessed messages (e.g. those delivered just now) are fast to read, and old archived messages are only stored remotely. We may have to add a hint when opening a message file to use for caching: whether it is a one-time scan (search in all messages in a mailbox), or a targeted fetch. The good thing is that message files don't change after having been written.
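A rough sketch of the cache behaviour described above, in Go. None of these types exist in mox: `Remote`, `Store`, `AccessHint`, and `memRemote` are hypothetical names. The sketch keeps the cache in memory, has no eviction, and is not safe for concurrent use, all of which a real implementation would need to handle.

```go
package main

import (
	"errors"
	"fmt"
)

// Remote is a hypothetical interface to the object storage.
type Remote interface {
	Put(id string, data []byte) error
	Get(id string) ([]byte, error)
}

// AccessHint tells the store why a message is being read.
type AccessHint int

const (
	Targeted AccessHint = iota // Fetch of a specific message: worth caching.
	ScanOnce                   // One-time scan (e.g. search): do not cache.
)

// Store is a write-through cache in front of a Remote.
type Store struct {
	remote Remote
	cache  map[string][]byte
}

func NewStore(r Remote) *Store {
	return &Store{remote: r, cache: map[string][]byte{}}
}

// Put writes to the remote first and only reports success after the remote
// write completed, so an accepted delivery is never held only in the cache.
func (s *Store) Put(id string, data []byte) error {
	if err := s.remote.Put(id, data); err != nil {
		return err
	}
	s.cache[id] = data
	return nil
}

// Get serves from the cache when possible. Remote reads populate the cache
// only for targeted fetches, so a mailbox-wide scan does not flush it.
func (s *Store) Get(id string, hint AccessHint) ([]byte, error) {
	if data, ok := s.cache[id]; ok {
		return data, nil
	}
	data, err := s.remote.Get(id)
	if err != nil {
		return nil, err
	}
	if hint == Targeted {
		s.cache[id] = data
	}
	return data, nil
}

// memRemote is an in-memory stand-in for real object storage.
type memRemote struct {
	objects map[string][]byte
	fail    bool
}

func (m *memRemote) Put(id string, data []byte) error {
	if m.fail {
		return errors.New("remote unavailable")
	}
	m.objects[id] = data
	return nil
}

func (m *memRemote) Get(id string) ([]byte, error) {
	data, ok := m.objects[id]
	if !ok {
		return nil, errors.New("not found")
	}
	return data, nil
}

func main() {
	r := &memRemote{objects: map[string][]byte{"old": []byte("archived")}}
	s := NewStore(r)
	if err := s.Put("new", []byte("just delivered")); err != nil {
		panic(err)
	}
	if _, err := s.Get("old", ScanOnce); err != nil {
		panic(err)
	}
	fmt.Println("cached entries:", len(s.cache))
}
```

Because message files never change once written, cached entries never need invalidation on update, only eviction and removal on delete.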

It could also be worth thinking about using this external storage for backing up the index file periodically.

I don't have time/priority to work on this. If you want to give it a try, I could give some hints.

daluntw commented 7 months ago

Hi @mjl-

I tried using S3FS to store messages and observed a significant increase in latency.

After thinking about it, I agree with you: using S3-like storage does lead to a huge increase in complexity and more problems later on for devs.

As an alternative for disaster recovery, I'm running an hourly (or daily) CLI backup from cron, followed by rclone to a remote (encrypted if needed). I now think this is an acceptable way to do remote backup / disaster recovery.

mjl- commented 7 months ago

> As an alternative for disaster recovery, I'm running an hourly (or daily) CLI backup from cron, followed by rclone to a remote (encrypted if needed). I now think this is an acceptable way to do remote backup / disaster recovery.

yes, this should be relatively efficient: the message files are hardlinked when you make a backup. syncing the files afterwards should be pretty efficient (only the added/removed files should be transferred). this is how i've set it up too.
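To illustrate why syncing is cheap when message files never change: comparing file names between the local backup and the remote is enough to decide what to transfer. The Go sketch below shows that idea only. It is not how rclone works internally, and `plan` and the file names are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// plan (hypothetical helper) returns the files to upload and the files to
// remove remotely, given the names present locally and on the remote.
// Files present on both sides are skipped, which is safe only because
// message files are immutable once written.
func plan(local, remote []string) (upload, remove []string) {
	localSet := map[string]bool{}
	for _, name := range local {
		localSet[name] = true
	}
	remoteSet := map[string]bool{}
	for _, name := range remote {
		remoteSet[name] = true
		if !localSet[name] {
			remove = append(remove, name)
		}
	}
	for _, name := range local {
		if !remoteSet[name] {
			upload = append(upload, name)
		}
	}
	sort.Strings(upload)
	sort.Strings(remove)
	return upload, remove
}

func main() {
	local := []string{"msg/1", "msg/2", "msg/3"}
	remote := []string{"msg/2", "msg/3", "msg/4"}
	upload, remove := plan(local, remote)
	fmt.Println("upload:", upload, "remove:", remove)
}
```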