pgnd closed 2 years ago
This is not (and will not be) built into any individual Dovecot FTS library (including flatcurve). The correct place to do this is in the Dovecot core libfts, which, as you point out, is done via fts_decoder configuration. decode2text is not really intended for production use; if you want to do attachment scanning, you should use fts_tika instead.
Edit: sorry, I meant using "fts_tika" configuration instead of "fts_decoder". Tika is a general-purpose text extractor, so it can be used by every FTS driver for Dovecot. For that reason, it doesn't make sense to hardcode this functionality into a specific FTS driver.
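For reference, a minimal sketch of what that could look like in a Dovecot 2.3-style config with flatcurve as the FTS driver. The Tika host and port are assumptions (a Tika server running locally on its default port), not something taken from this thread, so adjust them to your setup:

```
mail_plugins = $mail_plugins fts fts_flatcurve

plugin {
  fts = flatcurve

  # Assumption: Apache Tika server reachable on localhost:9998.
  # Attachments are sent to this URL and the extracted text is
  # handed to whichever FTS driver is configured above.
  fts_tika = http://localhost:9998/tika/
}
```

Because the extraction happens before the text reaches the driver, the same fts_tika line should work unchanged if the driver is swapped for another one.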
then, i'll revisit tika again. i'm trying to avoid a 'fat', resource-intensive java solution for a local box -- a primary motivator to find/move to fts-flatcurve from solr. solr+tika on one, local box is too heavy. admittedly, i've not tried a fts-flatcurve + tika solution to compare. yet.
xapian's capable of configurable attachment indexing/search, e.g.
https://xapian.org/docs/omega/overview.html
https://github.com/xelkano/redmine_xapian
https://wiki.bcs.rochester.edu/StatsWiki/HelpOnXapian
is dovecot-fts-flatcurve attachment capable, and configurable?
one alternative that (still?) seems to work is to enable fts_decoder,
https://doc.dovecot.org/settings/plugin/fts-plugin/#plugin_setting-fts-fts_decoder
as,
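a minimal sketch of such a setup, modeled on the example in the Dovecot fts_decoder documentation. the script path and the service user are assumptions and vary by distribution:

```
plugin {
  # Name of the decoder service defined below.
  fts_decoder = decode2text
}

service decode2text {
  # Assumption: location of the decode2text.sh script shipped with Dovecot;
  # some distributions install it elsewhere.
  executable = script /usr/libexec/dovecot/decode2text.sh
  user = dovecot
  unix_listener decode2text {
    mode = 0666
  }
}
```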
iiuc, the ->text converted attachment is scanned/indexed by fts.
but not using Xapian/flatcurve native capabilities.