Open figassis opened 3 years ago
My current idea for a distributed, scalable deployment is to put go-imap-sql on top of CockroachDB, with message blobs stored in object storage (e.g. S3). All of this is tracked in https://github.com/foxcpp/maddy/issues/279.
Attachment deduplication may be worth exploring though.
I agree. Wildduck probably gets most of its gains from attachment deduplication rather than from the specific storage backend. Deduplication can be done by storing attachment hashes, and it may even improve performance, since you would often not need to send a file to storage at all. Deleting a message with an attachment would only delete the file and its hash if it is the last message pointing to it.
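To make the hash idea concrete, here is a minimal Go sketch of reference-counted deduplication. It is not taken from maddy, go-imap-sql, or Wildduck; `DedupStore`, `Put`, and `Release` are hypothetical names, SHA-256 is just one reasonable choice of hash, and an in-memory map stands in for the real blob storage and hash index.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// DedupStore is a hypothetical in-memory stand-in for blob storage
// plus an index of attachment hashes. A real backend would persist both.
type DedupStore struct {
	blobs map[string][]byte // content hash -> attachment bytes
	refs  map[string]int    // content hash -> number of messages using it
}

func NewDedupStore() *DedupStore {
	return &DedupStore{blobs: map[string][]byte{}, refs: map[string]int{}}
}

// Put records one more reference to data and returns its content hash.
// The bytes are written only if this hash has not been stored before;
// stored reports whether a write actually happened.
func (s *DedupStore) Put(data []byte) (hash string, stored bool) {
	sum := sha256.Sum256(data)
	hash = hex.EncodeToString(sum[:])
	if s.refs[hash] == 0 {
		s.blobs[hash] = append([]byte(nil), data...) // keep a private copy
		stored = true
	}
	s.refs[hash]++
	return hash, stored
}

// Release drops one reference. The blob and its hash entry are deleted
// only when the last reference goes away; it returns true in that case.
func (s *DedupStore) Release(hash string) bool {
	n, ok := s.refs[hash]
	if !ok {
		return false // unknown hash, nothing to do
	}
	if n > 1 {
		s.refs[hash] = n - 1
		return false
	}
	delete(s.refs, hash)
	delete(s.blobs, hash)
	return true
}

// Len returns the number of distinct blobs currently stored.
func (s *DedupStore) Len() int { return len(s.blobs) }

func main() {
	s := NewDedupStore()
	attachment := []byte("report.pdf contents")

	// Two messages carry the same attachment: only the first one is written.
	h1, stored1 := s.Put(attachment)
	h2, stored2 := s.Put(attachment)
	fmt.Println(h1 == h2, stored1, stored2, s.Len()) // true true false 1

	// Deleting the first message keeps the blob; deleting the last removes it.
	fmt.Println(s.Release(h1), s.Len()) // false 1
	fmt.Println(s.Release(h2), s.Len()) // true 0
}
```

In a SQL-backed store the reference count would presumably live in a table keyed by hash and be updated in the same transaction as the message row, so that a crash cannot leave a blob orphaned or deleted early.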
I'm not very familiar with the codebase, but I do have Go experience, so I can help as soon as I find some bandwidth.
I second the S3 backend.
That also enables S3-compatible storage, which can easily be self-hosted with MinIO.
I do not want the maintenance burden of a separate server, machine, etc., be it Wildduck, Maildir, S3 or CockroachDB.
I would appreciate the ability to store my mail in the same database as the metadata (e.g. PostgreSQL). Maybe not in the same table as the metadata, but still. This would make consistent backups trivial and advanced search, filtering and analysis much easier. The same applies to attachments; it would make things like deduplication trivial.
Early versions of the imapsql backend stored message contents as a blob in the same table as the metadata. That turned out to be a performance problem. Message contents are now stored in an abstracted "external storage", and the only implementation currently available is a filesystem directory. It is definitely possible to add an implementation that just stores blobs in table rows. This should not cause performance problems as long as the table is separate from the metadata.
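A rough Go sketch of that split follows. The `BlobStore` interface, `MessageMeta`, and `tableStore` are hypothetical names chosen for illustration and are not go-imap-sql's actual external storage API; the map-backed `tableStore` only mimics a dedicated blobs table, where a real implementation would issue queries through `database/sql`.

```go
package main

import (
	"errors"
	"fmt"
)

// BlobStore is a hypothetical abstraction over where message bodies live.
// A filesystem directory, S3, or a separate SQL table could all satisfy it.
type BlobStore interface {
	Put(key string, body []byte) error
	Get(key string) ([]byte, error)
	Delete(key string) error
}

// ErrNotFound is returned when no blob exists for the given key.
var ErrNotFound = errors.New("blob not found")

// tableStore mimics a dedicated "blobs" table (key -> body) that is kept
// apart from the metadata table. It is in-memory only, for illustration.
type tableStore struct {
	rows map[string][]byte
}

func newTableStore() *tableStore {
	return &tableStore{rows: map[string][]byte{}}
}

func (t *tableStore) Put(key string, body []byte) error {
	t.rows[key] = append([]byte(nil), body...) // store a private copy
	return nil
}

func (t *tableStore) Get(key string) ([]byte, error) {
	body, ok := t.rows[key]
	if !ok {
		return nil, ErrNotFound
	}
	return body, nil
}

func (t *tableStore) Delete(key string) error {
	if _, ok := t.rows[key]; !ok {
		return ErrNotFound
	}
	delete(t.rows, key)
	return nil
}

// MessageMeta is a hypothetical metadata row. It carries only a key that
// points at the body, so scanning metadata never loads message contents.
type MessageMeta struct {
	UID     uint32
	Size    int
	BodyKey string
}

func main() {
	var store BlobStore = newTableStore()

	body := []byte("Subject: hi\r\n\r\nhello")
	meta := MessageMeta{UID: 1, Size: len(body), BodyKey: "msg-1"}
	if err := store.Put(meta.BodyKey, body); err != nil {
		panic(err)
	}

	// Listing a mailbox needs only the metadata row.
	fmt.Println(meta.UID, meta.Size)

	// The body is fetched separately, and only when a client asks for it.
	got, err := store.Get(meta.BodyKey)
	fmt.Println(string(got) == string(body), err)
}
```

The point of the indirection is that metadata queries stay small regardless of which `BlobStore` is plugged in, which is presumably why a separate table would avoid the earlier performance problem.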
Use case
What problem you are trying to solve? Maildir is less space efficient and less scalable than a clustered database as a mail store.
Note alternatives you considered and why they are not useful. I've tried using Maildir over an S3 backend, but performance can be an issue.
Your idea for a solution
Compress messages, deduplicate attachments and store in a clustered database like MongoDB.
How your solution would work in general? Wildduck stores messages and attachments in MongoDB. It compresses data and deduplicates attachments, greatly reducing storage requirements and allowing us to easily scale our deployments. I currently use it in production and it works great.
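For the compression half of the idea, a minimal Go sketch using the standard library's `compress/gzip` is shown below. This is an illustration only: gzip is an assumed choice, not necessarily what Wildduck uses, and `Compress` and `Decompress` are hypothetical helper names.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// Compress gzips a raw message before it is handed to storage.
func Compress(msg []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(msg); err != nil {
		return nil, err
	}
	// Close flushes the remaining data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// Decompress restores the original message bytes when it is read back.
func Decompress(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// Mail is mostly text with repetitive headers, so it tends to compress well.
	msg := []byte(strings.Repeat("Received: from mx.example.org\r\n", 100))

	packed, err := Compress(msg)
	if err != nil {
		panic(err)
	}
	unpacked, err := Decompress(packed)
	if err != nil {
		panic(err)
	}

	fmt.Println(len(packed) < len(msg))        // true
	fmt.Println(bytes.Equal(unpacked, msg))    // true
}
```

Attachments that are already compressed (images, archives) gain little from this, which is one reason deduplication is likely to matter more for them than compression.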