Closed swdee closed 4 years ago
In further testing I increased `vsz_limit` to 4G and got the same result: the process is killed with signal 6.
I then manually indexed each folder in the mailbox instead of the user account as a whole. The apparent memory leak occurs on the sent-mail folder, which contains fewer mails than other folders that build an index successfully.
Mailbox Folder | Mail Count | Index Build Result |
---|---|---|
msg_hold2 | 11074 | ok |
msg_hold | 12726 | ok |
b_globe | 804 | ok |
sent-mail | 7194 | coredumps |
Trash | 617 | ok |
Trash/junkles | 13 | ok |
Sent | 0 | ok |
msg_hold3 | 5045 | ok |
INBOX | 4734 | ok |
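The per-folder run summarised in the table above could be scripted roughly as follows. This is a minimal sketch, not the exact commands used in this thread: it assumes `doveadm index` exits non-zero when the index build fails, and the function name `index_each_folder` and the `DOVEADM` override variable are made up for illustration.

```shell
#!/bin/sh
# Index every folder of one user separately and report which ones fail.
# DOVEADM may be overridden (for example for a dry run); it defaults to doveadm.
DOVEADM="${DOVEADM:-doveadm}"

index_each_folder() {
    user="$1"
    # "doveadm mailbox list" prints one folder name per line.
    "$DOVEADM" mailbox list -u "$user" | while IFS= read -r folder; do
        # Redirect stdin so the indexing command cannot swallow the folder list.
        "$DOVEADM" index -u "$user" "$folder" </dev/null
        rc=$?
        if [ "$rc" -eq 0 ]; then
            printf '%s | ok\n' "$folder"
        else
            # Assumption: a crash or failed build shows up as a non-zero status.
            printf '%s | failed (exit %s)\n' "$folder" "$rc"
        fi
    done
}

# Usage: index_each_folder shanon_test
```

This only narrows the problem down to a folder. Finding the individual message still requires bisecting the folder contents, for example by moving half of the messages into a scratch folder and indexing that.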
Is there some way we can manually build the index but get a progress report for each individual email being indexed so we can see which one is causing the memory leak?
I have found the emails: two messages with file attachments containing Perl/CGI code cause the memory leak when indexing the sent-mail folder.
As a workaround I have removed those two emails.
Which version of Xapian are you using? Can you forward me the content of the email causing the memory leak?
We are using the packages available on Fedora 31, so xapian-core-1.4.13-2.
I can't post the email here as it is commercially sensitive. If you post your public SSH key I will set up a node with the environment and the problem email so you can investigate.
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7fM/a/DONp/IduuotIm/bsie1V7Mn6J23Ecr2zYID+JJJdm4AjwI9IXfFOMD4kD55g/bBCF4x1FdMcouBlR/11PSWQx+4r/eBQZg7i8R2hB2rGG9M44zNJLyZEz5fDtaBfz3gSiKBHD0dtGqAps/nuILknxgvDbCvWu43Me8VuR0tXLidG9EQcI/fOvGoykTxS9JEBYxMyIN7kslBClZnyCgZhJI24UR4EuphR9zxRXuPKbAu0Fxh/+q8tqkBqHXz5I97OZ8Bpdsl2HOFNVM/VHU8gX1I/5iVXqFaicEPSgJCugbzDWS+HS5dGFhFpmMtkNIfU06wiVB/chykclAz joan@gjlaptop
I have set up a temporary node demonstrating the problem so you can take a look. Make any changes you want on this node and don't worry about breaking anything, as we will delete it once done.
ssh using your key to root@45.79.78.191
I have built the fts-xapian plugin using the following commands
cd ~/build
git clone https://github.com/grosjo/fts-xapian
cd ~/build/fts-xapian/
autoreconf -vi
./configure --with-dovecot=/usr/lib64/dovecot
make
make install
# then restart dovecot for changes to take effect
systemctl restart dovecot
The mailbox containing the two problem emails that cause the memory leak is at /home/vmail/shanon_test/mdbox/
To reproduce the coredump, take the following steps:
1. First, delete the existing Xapian indexes:
systemctl stop dovecot
rm -rf /home/vmail/shanon_test/mdbox/xapian-indexes
systemctl start dovecot
2. Manually index the "corrupt" folder containing the two emails:
doveadm index -u shanon_test -q corrupt
3. Wait around 3 minutes; the indexer-worker eventually exhausts the default `vsz_limit` of 256 MB. Note that I have not increased this limit, as it then takes too long to exhaust and a higher limit should not be necessary for indexing only 2 emails. I use `top` to monitor when the indexer-worker has stopped or crashed.
4. Get the crash log:
$ tail -c 5000 /var/log/maillog
Feb 8 23:47:59 localhost dovecot[852]: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Feb 8 23:47:59 localhost dovecot[852]: indexer-worker: Error: what(): std::bad_alloc
Feb 8 23:48:01 localhost dovecot[852]: indexer: Error: Indexer worker disconnected, discarding 1 requests for shanon_test
Feb 8 23:48:01 localhost dovecot[852]: indexer-worker(shanon_test)<861>
5. The coredump files are written to `/var/lib/systemd/coredump/`.
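To avoid scanning the log by eye after each run, the indexer errors can be filtered out of the mail log. The sketch below is only an illustration: the helper names `indexer_errors` and `hit_bad_alloc` are hypothetical, and the pattern assumes the syslog line format shown in the log excerpt above.

```shell
#!/bin/sh
# Print indexer and indexer-worker error lines from a syslog-style mail log.
# The log path defaults to /var/log/maillog, as used in this thread.
indexer_errors() {
    grep -E 'indexer(-worker)?: Error' "${1:-/var/log/maillog}"
}

# Succeed only if one of those error lines mentions std::bad_alloc,
# i.e. the worker ran out of memory rather than failing for another reason.
hit_bad_alloc() {
    indexer_errors "$1" | grep -q 'std::bad_alloc'
}

# Usage:
#   indexer_errors | tail -n 20
#   hit_bad_alloc && echo "indexer-worker ran out of memory"
```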
Furthermore, I have found another mailbox that causes a crash. I haven't investigated it properly yet, but it seems to be crashing on an email with a PDF attachment. Is there a way to configure indexing so that it only indexes the email body and excludes the attachments?
I added an option to enable or disable indexing of attachments.
When attachments are indexed, only text ones are processed.
Feb 9 16:19:03 localhost dovecot[17497]: indexer-worker(shanon_test)<17508>
Nice fix! I will pull your latest code and run it on the full mailboxes on the production server we are setting up and let you know how it goes.
We have done multiple runs on the full mailboxes from a production server and everything indexes nicely now with `attachments=0`. Thanks for the fix and the additional option for disabling attachment indexing.
Search speed is good with mailboxes in the 20k-30k message range returning in 0.01 seconds.
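For anyone wanting to apply the same setting, a drop-in config could be generated along these lines. Treat this as a sketch under assumptions: only `attachments=0` comes from this thread, while the `partial`, `full` and `verbose` values, the file name and the helper `write_fts_conf` are illustrative, so check them against the plugin README for the version you build.

```shell
#!/bin/sh
# Write a Dovecot drop-in that enables fts-xapian with attachment indexing off.
# Only attachments=0 is taken from this thread; the other values are placeholders.
write_fts_conf() {
    target="$1"   # e.g. /etc/dovecot/conf.d/90-fts.conf (hypothetical file name)
    # Quoted heredoc delimiter keeps $mail_plugins literal for Dovecot to expand.
    cat > "$target" <<'EOF'
mail_plugins = $mail_plugins fts fts_xapian

plugin {
  fts = xapian
  fts_xapian = partial=3 full=20 attachments=0 verbose=0
}
EOF
}

# Usage (then restart Dovecot and rebuild the index):
#   write_fts_conf /etc/dovecot/conf.d/90-fts.conf
#   systemctl restart dovecot
#   doveadm fts rescan -u shanon_test
#   doveadm index -u shanon_test -q '*'
```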
When building an index on an existing mailbox we get the following error on one particular mailbox.
This mailbox is only one third the size of some other mailboxes that complete an index build successfully.
We have `vsz_limit` set as:
We also compiled the fts-xapian plugin with `XAPIAN_COMMIT_LIMIT = 250`.
Attached are the backtraces from the coredump.
bt.log bt-full.log