Closed agittins closed 2 years ago
Well, it has been quite some time since I had a look at this, but as far as I remember emails are stored as plain text files.
We have mysql for user management, why not store emails in mysql and then use sphinx-search (http://sphinxsearch.com/)?
The mail solution as it stands is enough for me, so I won't be looking into this myself, but I use sphinx in other settings and, as is well known, it is stunning.
Thanks for your thoughts.
Rather than re-invent the wheel, I was simply looking for how best to tie one of the existing dovecot fulltext-search solutions into the docker-mailserver setup.
I wasn't aware of sphinxsearch, so thanks for bringing it to my attention! :-) But for this particular job, I'm really just interested in integrating one of the existing solutions rather than implementing something new altogether, and I wouldn't be keen on pumping tens of GB of emails into mysql just for searching! Sphinx does support flat text files too, but again, it's not one of the well-tested existing FTS solutions for Dovecot, so that would be a project over on dovecot rather than here.
Does everyone who cares about fast searching just give up on self-hosting and use gmail? I'm surprised there doesn't seem to be a well-trodden path here :-/
So just one month later someone posted a handy blog post showing how they set up the xapian fts plugin (which is "community developed" rather than part of dovecot core).
I see Alpine has packages for it, too: https://pkgs.alpinelinux.org/packages?name=dovecot-fts-xapian*&branch=edge
If I can get some time I'll take a look at
So there's a PR on its way! :-)
Thanks for your great work. This is now released in v3.4.0
Fantastic! Glad to see this merged :-D
I find that as-is, email searches are slow and often time out before returning any results. This is obviously the case for body searches, but I also experience it regularly on header searches. Part of my issue is that my mailstore is on networked storage (linode block storage), but ultimately the brute-force sequential scan for body searches would be problematic for me even on SSD.
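To make the cost of that sequential scan concrete, here is a toy Python sketch contrasting it with an inverted-index lookup, which is the general idea behind full-text search backends. It is only an illustration under made-up assumptions: the `MAILS` data and all function names are hypothetical, and this is not how Dovecot or any of its FTS plugins are actually implemented.

```python
import re
from collections import defaultdict

# Hypothetical toy mail store: message id -> body text.
MAILS = {
    1: "Invoice for March attached",
    2: "Lunch on Friday?",
    3: "Reminder: March invoice is overdue",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def scan_search(mails, term):
    """Brute force: read and tokenize every body on every search."""
    term = term.lower()
    return sorted(mid for mid, body in mails.items() if term in tokenize(body))

def build_index(mails):
    """Read every body once and map each word to the ids containing it."""
    index = defaultdict(set)
    for mid, body in mails.items():
        for token in tokenize(body):
            index[token].add(mid)
    return index

def index_search(index, term):
    """Answer a search with a single lookup, without touching any body."""
    return sorted(index.get(term.lower(), ()))

index = build_index(MAILS)
print(scan_search(MAILS, "invoice"))
print(index_search(index, "invoice"))
```

Both searches return the same message ids, but the scan has to read the whole mailstore each time (painful on networked storage), while the index pays that cost once up front and then only on new mail.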
I think the solution to this is setting up full-text search support in Dovecot, and to be honest, I have found that just trying to work out how best to do it in dovecot even standalone has me well confused, let alone integrating that cleanly into a containerised setup. The built-in option of fts_squat is deprecated, so we shouldn't be using that, but the official preferred options seem to be either buy the commercial version, or set up solr or lucene, none of which particularly excite me as options (cost, memory, management).
Two of the other options on that page, fts-xapian and fts-elastic look promising though, with fts-xapian perhaps being the simplest one as it looks like it is self-contained and won't need a separate search server container (I might be wrong on that). Both appear currently to be actively maintained.
So I guess my request is two-fold.
It would be great if it could be built into the core project, assuming it is low-overhead and/or can be enabled/disabled by config. Surely others have come up against this and have some solution they use... maybe it's simply a matter of documenting it!