grosjo / fts-xapian

Dovecot FTS plugin based on Xapian
GNU Lesser General Public License v2.1

Huge xapian-indexes DB #130

Closed aacunha closed 9 months ago

aacunha commented 2 years ago

What is the typical ratio between the size of a mailbox and the size of its xapian-indexes?

In my testing, a mailbox with 19GB of mail generated 1,4GB of indexes.

Another mailbox, with 9,5GB of mail, generated 1,6GB of indexes.

Is this normal? Can I reduce the size with some config?

mweinelt commented 2 years ago

Also interested. For 11G of mail I get a 21G index.

endreszabo commented 1 year ago

19G of mail produces a 13,8G index here :)

grosjo commented 1 year ago

Yes, Xapian creates sizeable index files. There is not much to do about it.

Bangaio65 commented 1 year ago

Wouldn't leaving fts_decoder empty (meaning don't index attachments) and reducing the range between partial and full lower the size of the index?
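
A rough way to see why narrowing the partial/full range might shrink the index, sketched in Python. It assumes one index term is stored per substring whose length lies between partial and full. That is a simplification for illustration, not the plugin's actual code, and `substring_terms` is a hypothetical helper.

```python
def substring_terms(word: str, partial: int, full: int) -> set[str]:
    """Return the distinct substrings of `word` with partial <= length <= full.

    Assumption: this only mimics the general idea of n-gram indexing; the
    real plugin may filter, normalise, or prefix terms differently.
    """
    terms = set()
    longest = min(full, len(word))  # a substring cannot be longer than the word
    for length in range(partial, longest + 1):
        for start in range(len(word) - length + 1):
            terms.add(word[start:start + length])
    return terms


word = "configuration"  # 13 characters
narrow = substring_terms(word, partial=3, full=8)
wide = substring_terms(word, partial=3, full=20)
print(len(narrow), len(wide))
```

Under that assumption, a 13-character word yields 51 terms with full=8 and 66 with full=20. Words no longer than the full value gain nothing from lowering it further, so the saving depends on how many long words the mail contains.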

arodier commented 1 year ago

Actually, it is worse for me:

du -sh andre/mails/maildir andre/mails/indexes
1012M   andre/mails/maildir
4.5G    andre/mails/indexes

1G of email produces almost 5G of indexes. Why?
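
For comparing figures across mailboxes, a minimal Python sketch that computes the index-to-mail ratio for two directories. `dir_size` and `index_ratio` are hypothetical helper names; the script sums apparent file sizes, so its totals can differ slightly from `du`, which reports disk blocks.

```python
import os


def dir_size(path: str) -> int:
    """Sum the apparent size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted as local data.
            if os.path.isfile(full_path) and not os.path.islink(full_path):
                total += os.path.getsize(full_path)
    return total


def index_ratio(mail_dir: str, index_dir: str) -> float:
    """Return index size divided by mail size (1.0 means equal sizes)."""
    mail_bytes = dir_size(mail_dir)
    if mail_bytes == 0:
        raise ValueError("mail directory is empty")
    return dir_size(index_dir) / mail_bytes
```

Calling `index_ratio("andre/mails/maildir", "andre/mails/indexes")` on the layout above would return the ratio directly; with the `du` figures quoted (4.5G over 1012M) it comes out at roughly 4.5.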

grosjo commented 9 months ago

Well, this is how Xapian works. I can't do much about it.

petersphilo commented 8 months ago

I've had some limited success by reducing the range between partial and full, as mentioned by @Bangaio65, and by making sure to exclude the Trash and Junk folders. Here's my current config:

plugin {
    plugin = fts fts_xapian

    fts = xapian
    fts_xapian = partial=3 full=8 verbose=0 

    fts_autoindex = yes
    fts_enforced = yes

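    # A leading backslash marks a special-use flag (e.g. \Junk, \Trash);
    # plain mailbox names are listed without it.
    fts_autoindex_exclude = \Junk
    fts_autoindex_exclude2 = \Trash
    fts_autoindex_exclude3 = INBOX.spam
    fts_autoindex_exclude4 = INBOX.Trash
    fts_autoindex_exclude5 = spam
    fts_autoindex_exclude6 = INBOX.Junk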
}