Closed richardshanasy closed 3 years ago
Are you using the git version?
Can you share the gdb output (see https://www.dovecot.org/bugreport-mail for details)?
Thanks for your response. Yes, I'm using the git version from maybe 10 days ago. I've been following the link you sent, but I'm struggling to get back to the (gdb) prompt to enter bt. Here is what I've got so far:
Here is my gdb output:
richo@mail:/usr/lib/dovecot$ sudo gdb imap
[sudo] password for richo:
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from imap...
(No debugging symbols found in imap)
(gdb) r -u r
Starting program: /usr/lib/dovecot/imap -u r
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
* PREAUTH [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS BINARY MOVE SNIPPET=FUZZY PREVIEW=FUZZY LITERAL+ NOTIFY SPECIAL-USE] Logged in as r
imap(r)<60468><oyvtIf4QjF807AAAZU03Dg>: Info: FTS Xapian: Starting with partial=4 full=20 attachments=0 verbose=2
2 select "All Mail"
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft $Forwarded $Phishing $NotJunk JunkRecorded $Junk NotJunk Junk NonJunk)
* OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft $Forwarded $Phishing $NotJunk JunkRecorded $Junk NotJunk Junk NonJunk \*)] Flags permitted.
* 80296 EXISTS
* 0 RECENT
* OK [UNSEEN 3] First unseen.
* OK [UIDVALIDITY 1601548821] UIDs valid
* OK [UIDNEXT 80371] Predicted next UID
* OK [HIGHESTMODSEQ 96757] Highest
2 OK [READ-WRITE] Select completed (0.006 + 0.000 + 0.005 secs).
3 Search text richard
imap(r)<60468><oyvtIf4QjF807AAAZU03Dg>: Info: FTS Xapian: fts_backend_xapian_get_last_uid
imap(r)<60468><oyvtIf4QjF807AAAZU03Dg>: Info: FTX Xapian: Set box 'All Mail' (e8ee520f6aaa795f17590000654d370e)
imap(r)<60468><oyvtIf4QjF807AAAZU03Dg>: Info: FTS Xapian: Unset box '(null)' ((null))
imap(r)<60468><oyvtIf4QjF807AAAZU03Dg>: Info: FTS Xapian: Committed 'unset_box' in 0 ms
imap(r)<60468><oyvtIf4QjF807AAAZU03Dg>: Info: FTS Xapian: Opening DB (RO) /home/r/mail/.imap/xapian-indexes/db_e8ee520f6aaa795f17590000654d370e
imap(r)<60468><oyvtIf4QjF807AAAZU03Dg>: Info: FTS Xapian: Get last UID of All Mail (e8ee520f6aaa795f17590000654d370e) = 161
imap(r)<60468><oyvtIf4QjF807AAAZU03Dg>: Error: indexer failed to index mailbox All Mail
3 NO [SERVERBUG] Internal error occurred. Refer to server log for more information. [2020-10-18 20:56:29] (43.964 + 0.003 + 43.966 secs).
And here is the corresponding log:
Oct 18 20:56:22 mail dovecot: imap(r)<59509><objkZ+6xbo80fYAl>: FTS Xapian: fts_backend_xapian_deinit
Oct 18 20:56:22 mail dovecot: imap(r)<59509><objkZ+6xbo80fYAl>: Deinit /home/r/mail/.imap/xapian-indexes
Oct 18 20:56:22 mail dovecot: imap(r)<59509><objkZ+6xbo80fYAl>: FTS Xapian: Unset box '(null)' ((null))
Oct 18 20:56:22 mail dovecot: imap(r)<59509><objkZ+6xbo80fYAl>: FTS Xapian: Committed 'unset_box' in 0 ms
Oct 18 20:56:22 mail dovecot: indexer-worker(r)<60473><oyvtIf4QjF807AAAZU03Dg:CUogLSERjF857AAAZU03Dg>: FTS Xapian: fts_backend_xapian_update_build_more
Oct 18 20:56:22 mail dovecot: indexer-worker(r)<60473><oyvtIf4QjF807AAAZU03Dg:CUogLSERjF857AAAZU03Dg>: FTS Xapian: Query= uid:"192"
Oct 18 20:56:22 mail dovecot: indexer-worker(r)<60473><oyvtIf4QjF807AAAZU03Dg:CUogLSERjF857AAAZU03Dg>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Oct 18 20:56:22 mail dovecot: indexer-worker(r)<60473><oyvtIf4QjF807AAAZU03Dg:CUogLSERjF857AAAZU03Dg>: FTS Xapian: fts_backend_xapian_update_build_more
Oct 18 20:56:22 mail dovecot: indexer-worker(r)<60473><oyvtIf4QjF807AAAZU03Dg:CUogLSERjF857AAAZU03Dg>: FTS Xapian: Query= uid:"192"
Oct 18 20:56:24 mail dovecot: imap-login: Login: user=<r>, method=PLAIN, rip=52.125.128.37, lip=192.168.1.200, mpid=60478, TLS, session=<FraxAu+xppY0fYAl>
Oct 18 20:56:24 mail dovecot: imap(r)<60478><FraxAu+xppY0fYAl>: FTS Xapian: Starting with partial=4 full=20 attachments=0 verbose=2
Oct 18 20:56:29 mail dovecot: indexer-worker(r)<60473><oyvtIf4QjF807AAAZU03Dg:CUogLSERjF857AAAZU03Dg>: FTS Xapian: NGRAM(body,XBDY) -> 60380 items, max length=63, (total 687 KB)
Oct 18 20:56:29 mail dovecot: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Oct 18 20:56:29 mail dovecot: indexer-worker: Error: what(): std::bad_alloc
Oct 18 20:56:29 mail dovecot: indexer: Error: Indexer worker disconnected, discarding 1 requests for r
Oct 18 20:56:29 mail dovecot: indexer-worker(r)<60473><oyvtIf4QjF807AAAZU03Dg:CUogLSERjF857AAAZU03Dg>: Fatal: master: service(indexer-worker): child 60473 killed with signal 6 (core dumped)
I would need the output of the "bt full" command in gdb.
How do I get out of imap (running inside gdb) and back to the gdb console to enter "bt full"? imap itself doesn't crash, only the indexer-worker.
The best way is to enable core dumps. On Arch Linux, they are in /var/lib/systemd/coredump/. Which distribution are you using?
Ubuntu 20.04. Thanks so much. I'll look into it..
Thanks heaps I'll have a play and get back to you tomorrow.
Thanks so much for your help, grosjo. I rebuilt it and ran the doveadm index tool, and it's working now with no crashes!
Did you do anything other than rebuild? I am seeing these errors as well:
Oct 30 09:31:48 fawkes dovecot: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Oct 30 09:31:48 fawkes dovecot: indexer-worker: Error: what(): std::bad_alloc
Oct 30 09:31:48 fawkes dovecot: indexer: Error: Indexer worker disconnected, discarding 1 requests for fts-test@example.com
Oct 30 09:31:48 fawkes dovecot: indexer-worker(fts-test@example.com)<22118><kWZ67N+ycNvAqJEI:iD5DBR3dm19mVgAA+4JX8g>: Fatal: master: service(indexer-worker): child 22118 killed with signal 6 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)
I tried 1.3.3, which did not help, so now I am back on master, scratching my head. I have enabled core dumps, so I will report back here or open a new issue if I find anything useful.
A core dump with a gdb backtrace would help very much.
https://dovecot.org/bugreport.html#coredumps and some...
Nothing yet. It happens irregularly: sometimes it takes several hours, sometimes it happens several times in a minute. This is a live system, so I have to be a bit considerate. I don't know what causes it, other than that the affected users are known to have large mailboxes.
I've also added the segment below to see if that helps.
default_vsz_limit = 0
service indexer-worker {
  vsz_limit = 0
}
Should I rather remove that?
No, keep that. Let's wait for the next core dump.
I changed partial to 4 and haven't had an issue since.
@richardshanasy thanks. I gave that a try but no luck for me. See below.
@grosjo I'm afraid I had to kill indexer-worker after 13 hours: memory usage for a reindex (I was hoping to trigger the issue) grew to 12.7GB and CPU usage was stuck at 100%, even though the total IMAP maildir size was < 6GB. I have therefore commented out the vsz_limit segment since, other than enabling core dumps, that was the only change I made.
After restarting dovecot with partial=4, I tried to kick off a reindex of the user's mailbox and got the error below:
Fatal: write(indexer) failed: Resource temporarily unavailable
I tried renaming the xapian-indexes folder, but that did not help. However, running doveadm index without -q does seem to be rebuilding their index.
So now I started reindexing my own maildir and within minutes hit the issue again:
Fatal: master: service(indexer-worker): child 21350 killed with signal 6 (core not dumped - https://dovecot.org/bugreport.html#coredumps - set /proc/sys/kernel/core_pattern to absolute path)
Prompted by that suggestion, I ran:
echo "/var/core/core.%e.%p" > /proc/sys/kernel/core_pattern
and I have left the indexer-worker rebuilding my index so that it hits the issue again, rather than killing it off and ending up with the same Resource temporarily unavailable issue.
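For anyone else hitting the "core not dumped ... set /proc/sys/kernel/core_pattern to absolute path" message, here is a minimal Python sketch for checking what the kernel is currently set to. The function names are made up for illustration; the only things taken from the thread are the file path and the example pattern.

```python
from pathlib import Path

# Kernel setting named in the Dovecot error message above (Linux only).
CORE_PATTERN_FILE = Path("/proc/sys/kernel/core_pattern")


def classify_core_pattern(pattern):
    """Classify a core_pattern value as 'pipe', 'absolute', or 'relative'."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # The kernel pipes the core to a helper program instead of writing a file.
        return "pipe"
    if pattern.startswith("/"):
        return "absolute"
    # Written relative to the crashing process's working directory.
    return "relative"


def current_core_pattern():
    """Read the live kernel setting; hypothetical helper, needs Linux /proc."""
    return CORE_PATTERN_FILE.read_text()


if __name__ == "__main__":
    value = current_core_pattern()
    print(f"core_pattern = {value.strip()!r} -> {classify_core_pattern(value)}")
```

If this reports anything other than "absolute", the echo command above is probably what is needed; note that the target directory must exist and be writable by the crashing process.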
Caught a core. Here is your backtrace as requested:
(gdb) bt
#0 0x00007fbdb25857bb in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fbdb2570535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fbdb011d983 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007fbdb01238c6 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007fbdb0123901 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fbdb0123b89 in __cxa_rethrow () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fbdafa41334 in GlassWritableDatabase::replace_document (this=0x55b6488160b0, did=<optimized out>, document=...) at backends/glass/glass_database.cc:1379
#7 0x00007fbdafec9577 in fts_backend_xapian_index_text (dbx=0x55b648e1e8d0, uid=<optimized out>, field=<optimized out>, data=<optimized out>, p=<optimized out>, f=<optimized out>)
at fts-backend-xapian-functions.cpp:988
#8 0x00007fbdafec9ddd in fts_backend_xapian_update_build_more (_ctx=0x55b6482c2aa0,
data=0x55b648448780 "\n>> uSaVJz1Kdk3enjyZ4p5SkfJUwBZUC56n+qfWpb5OC03bn/YpPSa9PQOXkZhxVEgRpgn7MrUz\n>> 8zKHs8yzirOky5yX7Vw2JQoSNWVD2Yuyu8U02c/UgMREsl4ymuOWU5PzJjc690iecp4wb2C5\n>> 2fJNyyfyffO/XoFawV3RW6BbsLZgdKXnyvpV0Kqlq3pX"..., size=<optimized out>) at fts-backend-xapian.cpp:505
#9 0x00007fbdb2510c10 in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#10 0x00007fbdb25114bd in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#11 0x00007fbdb2511bbd in fts_build_mail () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#12 0x00007fbdb25178b0 in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#13 0x00007fbdb2919344 in mail_precache () from /usr/lib/dovecot/libdovecot-storage.so.0
#14 0x000055b6469d5c66 in ?? ()
#15 0x00007fbdb282eadf in io_loop_call_io () from /usr/lib/dovecot/libdovecot.so.0
#16 0x00007fbdb28300d6 in io_loop_handler_run_internal () from /usr/lib/dovecot/libdovecot.so.0
#17 0x00007fbdb282eb7c in io_loop_handler_run () from /usr/lib/dovecot/libdovecot.so.0
#18 0x00007fbdb282ece0 in io_loop_run () from /usr/lib/dovecot/libdovecot.so.0
#19 0x00007fbdb27af0d3 in master_service_run () from /usr/lib/dovecot/libdovecot.so.0
#20 0x000055b6469d56aa in main ()
(gdb)
And git:
# git rev-parse HEAD
5de8c5d3c016f38b80cc7095fc6e7ed0493affea
I guess you would like verbose=1 next?
Thanks for your help on this BTW.
If you can, please show the "verbose=1" log from just before the crash. There should be a line of the format: "FTS Xapian: NGRAM(%s,%s) -> %ld items, max length=%ld, (total %ld KB)"
I now have a couple of dumps, all at the same place:
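In case it helps with digging through a busy syslog, the following is a rough Python sketch that pulls out the last NGRAM line seen before each std::bad_alloc. The regular expression is derived only from the format string quoted above; the function name and the returned dict keys are my own and not part of the plugin.

```python
import re

# Built from the format string quoted in this thread:
# "FTS Xapian: NGRAM(%s,%s) -> %ld items, max length=%ld, (total %ld KB)"
NGRAM_RE = re.compile(
    r"FTS Xapian: NGRAM\((?P<field>[^,]+),(?P<prefix>[^)]+)\) -> "
    r"(?P<items>\d+) items, max length=(?P<maxlen>\d+), \(total (?P<kb>\d+) KB\)"
)
CRASH_MARKER = "std::bad_alloc"


def ngram_before_crashes(lines):
    """Return one dict per crash: the last NGRAM line logged before it."""
    results = []
    last = None
    for line in lines:
        m = NGRAM_RE.search(line)
        if m:
            last = {
                "field": m.group("field"),
                "prefix": m.group("prefix"),
                "items": int(m.group("items")),
                "max_length": int(m.group("maxlen")),
                "total_kb": int(m.group("kb")),
            }
        elif CRASH_MARKER in line and last is not None:
            results.append(last)
            # Each crash logs two bad_alloc lines; reset so the second is skipped.
            last = None
    return results


if __name__ == "__main__":
    import sys
    for entry in ngram_before_crashes(sys.stdin):
        print(entry)
```

It reads the log on stdin, so piping a syslog file into the script is enough. A crash with no NGRAM line since the previous crash is not reported.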
Oct 31 05:58:40 fawkes dovecot: indexer-worker(fts-test@example.com)<2771><AHBYCtb8nF/TCgAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 112898 items, max length=64, (total 1363 KB)
Oct 31 05:58:40 fawkes dovecot: indexer-worker(fts-test@example.com)<2771><AHBYCtb8nF/TCgAA+4JX8g>: FTS Xapian: Query= uid:"20"
Oct 31 05:58:40 fawkes dovecot: indexer-worker(fts-test@example.com)<2771><AHBYCtb8nF/TCgAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 964 items, max length=64, (total 11 KB)
Oct 31 05:58:41 fawkes dovecot: indexer-worker(fts-test@example.com)<2771><AHBYCtb8nF/TCgAA+4JX8g>: FTS Xapian: Query= uid:"20"
Oct 31 05:58:49 fawkes dovecot: indexer-worker(fts-test@example.com)<2771><AHBYCtb8nF/TCgAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 86535 items, max length=64, (total 1041 KB)
Oct 31 05:58:49 fawkes dovecot: indexer-worker(fts-test@example.com)<2771><AHBYCtb8nF/TCgAA+4JX8g>: FTS Xapian: Query= uid:"20"
Oct 31 05:58:49 fawkes dovecot: indexer-worker(fts-test@example.com)<2771><AHBYCtb8nF/TCgAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 963 items, max length=64, (total 11 KB)
Oct 31 05:58:49 fawkes dovecot: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Oct 31 05:58:49 fawkes dovecot: indexer-worker: Error: what(): std::bad_alloc
Oct 31 05:58:50 fawkes dovecot: indexer: Error: Indexer worker disconnected, discarding 62 requests for fts-test@example.com
Oct 31 05:58:50 fawkes dovecot: indexer-worker(fts-test@example.com)<2771><AHBYCtb8nF/TCgAA+4JX8g>: Fatal: master: service(indexer-worker): child 2771 killed with signal 6 (core dumped)
...
Oct 31 06:26:48 fawkes dovecot: indexer-worker(pb2@example.com)<2990></8RSOO4CnV/oDgAA+4JX8g:APQ7IGgDnV+uCwAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 112012 items, max length=64, (total 1352 KB)
Oct 31 06:26:48 fawkes dovecot: indexer-worker(pb2@example.com)<2990></8RSOO4CnV/oDgAA+4JX8g:APQ7IGgDnV+uCwAA+4JX8g>: FTS Xapian: Query= uid:"19"
Oct 31 06:26:48 fawkes dovecot: indexer-worker(pb2@example.com)<2990></8RSOO4CnV/oDgAA+4JX8g:APQ7IGgDnV+uCwAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 946 items, max length=63, (total 11 KB)
Oct 31 06:26:49 fawkes dovecot: indexer-worker(pb2@example.com)<2990></8RSOO4CnV/oDgAA+4JX8g:APQ7IGgDnV+uCwAA+4JX8g>: FTS Xapian: Query= uid:"19"
Oct 31 06:26:57 fawkes dovecot: indexer-worker(pb2@example.com)<2990></8RSOO4CnV/oDgAA+4JX8g:APQ7IGgDnV+uCwAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 87492 items, max length=64, (total 1052 KB)
Oct 31 06:26:58 fawkes dovecot: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Oct 31 06:26:58 fawkes dovecot: indexer-worker: Error: what(): std::bad_alloc
Oct 31 06:26:58 fawkes dovecot: indexer: Error: Indexer worker disconnected, discarding 212 requests for pb2@example.com
Oct 31 06:26:58 fawkes dovecot: indexer-worker(pb2@example.com)<3920></8RSOO4CnV/oDgAA+4JX8g:AsfwD7IDnV9QDwAA+4JX8g>: FTS Xapian: Starting with partial=3 full=20 attachments=0 verbose=1
Oct 31 06:26:58 fawkes dovecot: indexer-worker(pb2@example.com)<2990></8RSOO4CnV/oDgAA+4JX8g:APQ7IGgDnV+uCwAA+4JX8g>: Fatal: master: service(indexer-worker): child 2990 killed with signal 6 (core dumped)/
...
Oct 31 06:28:01 fawkes dovecot: indexer-worker(pb2@example.com)<3920></8RSOO4CnV/oDgAA+4JX8g:AsfwD7IDnV9QDwAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 112012 items, max length=64, (total 1352 KB)
Oct 31 06:28:02 fawkes dovecot: indexer-worker(pb2@example.com)<3920></8RSOO4CnV/oDgAA+4JX8g:AsfwD7IDnV9QDwAA+4JX8g>: FTS Xapian: Query= uid:"19"
Oct 31 06:28:02 fawkes dovecot: indexer-worker(pb2@example.com)<3920></8RSOO4CnV/oDgAA+4JX8g:AsfwD7IDnV9QDwAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 946 items, max length=63, (total 11 KB)
Oct 31 06:28:02 fawkes dovecot: indexer-worker(pb2@example.com)<3920></8RSOO4CnV/oDgAA+4JX8g:AsfwD7IDnV9QDwAA+4JX8g>: FTS Xapian: Query= uid:"19"
Oct 31 06:28:11 fawkes dovecot: indexer-worker(pb2@example.com)<3920></8RSOO4CnV/oDgAA+4JX8g:AsfwD7IDnV9QDwAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 87492 items, max length=64, (total 1052 KB)
Oct 31 06:28:11 fawkes dovecot: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Oct 31 06:28:11 fawkes dovecot: indexer-worker: Error: what(): std::bad_alloc
Oct 31 06:28:11 fawkes dovecot: indexer: Error: Indexer worker disconnected, discarding 41 requests for pb2@example.com
Oct 31 06:28:11 fawkes dovecot: indexer-worker(pb2@example.com)<3971></8RSOO4CnV/oDgAA+4JX8g:Re2ULfsDnV+DDwAA+4JX8g>: FTS Xapian: Starting with partial=3 full=20 attachments=0 verbose=1
Oct 31 06:28:12 fawkes dovecot: indexer-worker(pb2@example.com)<3920></8RSOO4CnV/oDgAA+4JX8g:AsfwD7IDnV9QDwAA+4JX8g>: Fatal: master: service(indexer-worker): child 3920 killed with signal 6 (core dumped)
Which version of the Xapian library are you using?
1.4.17
Can you kindly update to latest git and try again ?
Just pulled the changes in now and will start a reindex.
I'm afraid it's not very healthy at all. I started a reindex, which fails no matter which way I run it, e.g.:
# doveadm -Dv index -u fts-test@example.com \*
Debug: Loading modules from directory: /usr/lib/dovecot/modules
Debug: Module loaded: /usr/lib/dovecot/modules/lib20_fts_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/lib20_zlib_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/lib21_fts_xapian_plugin.so
Debug: Loading modules from directory: /usr/lib/dovecot/modules/doveadm
Debug: Module loaded: /usr/lib/dovecot/modules/doveadm/lib10_doveadm_sieve_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/doveadm/lib20_doveadm_fts_plugin.so
doveadm(fts-test@example.com)<18705><>: Debug: auth USER input: fts-test@example.com home=/var/vmail/example.com/fts-test/ uid=2000 gid=2000
doveadm(fts-test@example.com): Debug: Effective uid=2000, gid=2000, home=/var/vmail/example.com/fts-test/
doveadm(fts-test@example.com): Debug: Namespace inbox: type=private, prefix=, sep=, inbox=yes, hidden=no, list=yes, subscriptions=yes location=maildir:/var/vmail/example.com/fts-test/
doveadm(fts-test@example.com): Debug: maildir++: root=/var/vmail/example.com/fts-test, index=, indexpvt=, control=, inbox=/var/vmail/example.com/fts-test, alt=
doveadm(fts-test@example.com): Debug: Namespace : Using permissions from /var/vmail/example.com/fts-test: mode=0700 gid=default
doveadm(fts-test@example.com): Info: FTS Xapian: Starting with partial=3 full=20 attachments=0 verbose=1
doveadm(fts-test@example.com): Debug: Mailbox Trash: Mailbox opened because: index
doveadm(fts-test@example.com): Info: FTS Xapian: fts_backend_xapian_get_last_uid
doveadm(fts-test@example.com): Error: FTS Xapian: GetLastUID: Can not open db RO
doveadm(fts-test@example.com): Error: Mailbox Trash: Status lookup failed: Internal error occurred. Refer to server log for more information. [2020-11-01 10:51:07]
doveadm(fts-test@example.com): Debug: Mailbox Trash.2019: Mailbox opened because: index
doveadm(fts-test@example.com): Info: FTS Xapian: fts_backend_xapian_get_last_uid
doveadm(fts-test@example.com): Info: FTS Xapian: Committed 'unset_box' in 0 ms
doveadm(fts-test@example.com): Error: FTS Xapian: GetLastUID: Can not open db RO
doveadm(fts-test@example.com): Error: Mailbox Trash.2019: Status lookup failed: Internal error occurred. Refer to server log for more information. [2020-11-01 10:51:07]
doveadm(fts-test@example.com): Debug: Mailbox Trash.2018: Mailbox opened because: index
doveadm(fts-test@example.com): Info: FTS Xapian: fts_backend_xapian_get_last_uid
doveadm(fts-test@example.com): Info: FTS Xapian: Committed 'unset_box' in 0 ms
doveadm(fts-test@example.com): Error: FTS Xapian: GetLastUID: Can not open db RO
doveadm(fts-test@example.com): Error: Mailbox Trash.2018: Status lookup failed: Internal error occurred. Refer to server log for more information. [2020-11-01 10:51:07]
doveadm(fts-test@example.com): Debug: Mailbox Trash.2017: Mailbox opened because: index
doveadm(fts-test@example.com): Info: FTS Xapian: fts_backend_xapian_get_last_uid
doveadm(fts-test@example.com): Info: FTS Xapian: Committed 'unset_box' in 0 ms
doveadm(fts-test@example.com): Error: FTS Xapian: GetLastUID: Can not open db RO
doveadm(fts-test@example.com): Error: Mailbox Trash.2017: Status lookup failed: Internal error occurred. Refer to server log for more information. [2020-11-01 10:51:07]
...
Basically it does this on every folder. I cannot get past this point to reindex & provoke a core.
The log from the last core before your last set of commits was:
Oct 31 23:21:54 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Oct 31 23:21:54 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 830 items, max length=48, (total 9 KB)
Oct 31 23:21:55 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Refreshing after 12720 ms (vs 300000) and 11 updates (vs 1000000) and 1197 KB ...
Oct 31 23:21:55 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Committed 'refreshing' in 321 ms
Oct 31 23:21:55 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Opening DB (RW) /var/vmail/example.com/fts-test/xapian-indexes/db_906ac8b3586eb17a47ece7f562737c03
Oct 31 23:21:55 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Opening DB (RW) /var/vmail/example.com/fts-test/xapian-indexes/db_906ac8b3586eb17a47ece7f562737c03 : Done
Oct 31 23:21:55 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Oct 31 23:21:59 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 62309 items, max length=64, (total 662 KB)
Oct 31 23:22:00 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Oct 31 23:22:00 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 662 items, max length=31, (total 6 KB)
Oct 31 23:22:01 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Oct 31 23:22:01 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Oct 31 23:22:01 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Oct 31 23:22:01 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 902 items, max length=36, (total 10 KB)
Oct 31 23:22:02 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Oct 31 23:22:02 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 20937 items, max length=61, (total 217 KB)
Oct 31 23:22:03 fawkes dovecot: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Oct 31 23:22:03 fawkes dovecot: indexer-worker: Error: what(): std::bad_alloc
Oct 31 23:22:03 fawkes dovecot: indexer: Error: Indexer worker disconnected, discarding 10 requests for fts-test@example.com
Oct 31 23:22:04 fawkes dovecot: indexer-worker(fts-test@example.com)<29005><uOPEAyjxnV9NcQAA+4JX8g>: Fatal: master: service(indexer-worker): child 29005 killed with signal 6 (core dumped)
Worryingly, the xapian-indexes folder was growing in size each time, and for the above user it had become 4GB+ in size while their mailbox was 6GB.
Can you please provide the core dump/gdb output, based on the latest git?
As per my previous comment, I cannot provoke a core dump because the latest git will not re-index. Any attempt to create a new index for a mailbox returns the following error per folder:
Error: FTS Xapian: GetLastUID: Can not open db RO
It seems it now needs an existing DB, so I have had to revert to 1.4.1. Unfortunately, this appears to have also triggered a reindex of almost every mailbox, not just the mailbox provoking the core dump. This is a live system, albeit with a relatively small number of (large) mailboxes.
As soon as this completes, I'm going to take a copy of the mailbox that provokes the core so I can run further tests and try to figure out what is causing it. At the moment, HEAD is unfortunately unusable for me.
Oops, my bad.
I changed a file without testing it much.
I pushed a fix. Can you try with HEAD again?
Your fix appears to have resolved the Error: FTS Xapian: GetLastUID: Can not open db RO error.
Reindexing now.
Backtrace:
(gdb) bt
#0 0x00007f646086d7bb in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f6460858535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f645e405983 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007f645e40b8c6 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007f645e40b901 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007f645e40bb89 in __cxa_rethrow () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007f645dd29334 in GlassWritableDatabase::replace_document (this=0x55c1404e13f0, did=<optimized out>, document=...) at backends/glass/glass_database.cc:1379
#7 0x00007f645e1b074f in fts_backend_xapian_index_text (backend=0x55c13c4e4ae0, uid=<optimized out>, field=<optimized out>, data=<optimized out>) at fts-backend-xapian-functions.cpp:1028
#8 0x00007f645e1b0fe8 in fts_backend_xapian_update_build_more (_ctx=0x55c13c52e2e0,
data=0x55c143b02950 "\n>> e9C/fDqPO91Ygufgtco6t07yYZuVcEkuYrXGjkFRzEmx8NXlNRusGcgsq/EoflU8iQ8TXzAV\n>> UVJbrUDqnYzU1D9A82WUkkFdBJSif31GTAnFFNDfmf75DzJSCtO4dmQsHqXpKAjBWFhqHYtL\n>> WcwGotDI4fFrOMKuUji7w0UlwxazcE5c8Ki38Y4l957s"..., size=<optimized out>) at fts-backend-xapian.cpp:518
#9 0x00007f64607f8c10 in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#10 0x00007f64607f94bd in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#11 0x00007f64607f9bbd in fts_build_mail () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#12 0x00007f64607ff8b0 in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#13 0x00007f6460c01344 in mail_precache () from /usr/lib/dovecot/libdovecot-storage.so.0
#14 0x000055c13c01ec66 in ?? ()
#15 0x00007f6460b16adf in io_loop_call_io () from /usr/lib/dovecot/libdovecot.so.0
#16 0x00007f6460b180d6 in io_loop_handler_run_internal () from /usr/lib/dovecot/libdovecot.so.0
#17 0x00007f6460b16b7c in io_loop_handler_run () from /usr/lib/dovecot/libdovecot.so.0
#18 0x00007f6460b16ce0 in io_loop_run () from /usr/lib/dovecot/libdovecot.so.0
#19 0x00007f6460a970d3 in master_service_run () from /usr/lib/dovecot/libdovecot.so.0
#20 0x000055c13c01e6aa in main ()
(gdb)
Log leading up to core:
Nov 1 18:53:06 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 55923 items, max length=64, (total 610 KB)
Nov 1 18:53:07 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Nov 1 18:53:07 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 409 items, max length=26, (total 4 KB)
Nov 1 18:53:07 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Nov 1 18:53:07 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Nov 1 18:53:07 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Nov 1 18:53:07 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 664 items, max length=37, (total 6 KB)
Nov 1 18:53:08 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Nov 1 18:53:08 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 22532 items, max length=63, (total 243 KB)
Nov 1 18:53:09 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Nov 1 18:53:09 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 902 items, max length=53, (total 10 KB)
Nov 1 18:53:09 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: Query= uid:"1817"
Nov 1 18:53:14 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 64313 items, max length=63, (total 696 KB)
Nov 1 18:53:14 fawkes dovecot: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Nov 1 18:53:14 fawkes dovecot: indexer-worker: Error: what(): std::bad_alloc
Nov 1 18:53:14 fawkes dovecot: indexer: Error: Indexer worker disconnected, discarding 10 requests for fts-test@example.com
Nov 1 18:53:15 fawkes dovecot: indexer-worker(fts-test@example.com)<1264><oHNkMt4Dn1/wBAAA+4JX8g>: Fatal: master: service(indexer-worker): child 1264 killed with signal 6 (core dumped)
My last change made sure there was at least 10 MB of available RAM before indexing a mail. Here Xapian (the crash is in replace_document, per the backtrace) seems to be running out of RAM.
What is the RAM setup on your server? Have you kept the "default_vsz_limit = 0" parameter?
I thought you might ask this :-)
The server has 16GB physical/16GB swap, and was running with 3.5GB used before I started the indexer-worker. So plenty. As I mentioned previously, running with default_vsz_limit = 0 chews up memory: indexer-worker got to 12.7GB and started swapping, so I had to kill it and remove the default_vsz_limit parameter.
This is a live server, so I cannot afford to give infinite memory to what appeared to be a memory leak. If you can suggest something less (like 4GB) I'm happy to try that (what format is default_vsz_limit?).
try service indexer-worker { vsz_limit = 4G }
And roughly what size is the specific mailbox where it crashes?
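On the format question for vsz_limit: as far as I know, Dovecot size settings take a number with an optional B, K, M, G or T suffix in powers of 1024. The Python sketch below only illustrates that assumed format (parse_size is a made-up helper, not Dovecot code), which makes it easy to compare a candidate limit against the memory figures reported in this thread.

```python
import re

# Assumed suffix set and 1024-based multipliers; check the Dovecot
# settings documentation for your version before relying on this.
_UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(value):
    """Convert a size string such as '4G' or '1 G' to a byte count."""
    m = re.fullmatch(r"\s*(\d+)\s*([BKMGT]?)\s*", value, flags=re.IGNORECASE)
    if not m:
        raise ValueError(f"unrecognised size value: {value!r}")
    number, unit = m.groups()
    # A bare number is treated as bytes.
    return int(number) * _UNITS[unit.upper() or "B"]


if __name__ == "__main__":
    for candidate in ("0", "1 G", "4G"):
        print(candidate, "->", parse_size(candidate), "bytes")
```

A value of 0 is what the earlier configuration in this thread used to lift the limit entirely.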
Thanks, that worked. Indexing completed. The mailbox is 6.7GB. I watched indexer-worker with htop and it grew to about 470MB.
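For anyone who wants to log this rather than watch htop, here is a small Python sketch that reads the same numbers from /proc/<pid>/status on Linux. The function names are hypothetical. VmSize is the virtual size, which is the figure a vsz_limit would be compared against, and VmRSS is the resident memory.

```python
def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status content."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # The value is followed by its unit, e.g. "481280 kB".
            fields[key] = int(rest.split()[0])
    return fields


def read_vm_fields(pid):
    """Read the fields for a running process; needs Linux /proc."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vm_fields(fh.read())


if __name__ == "__main__":
    import sys
    # Usage: pass the PID of the indexer-worker process as the only argument.
    print(read_vm_fields(int(sys.argv[1])))
```

Running it in a loop against the indexer-worker PID during a reindex would show whether memory levels off or keeps growing.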
So I guess the default size is the issue. Well, at least that is sorted. I'm going to clean the xapian-indexes directory of the other mailboxes and reindex those, as each time indexer-worker cored, several hundred MB of filesystem space got chewed up.
Quick question:
Each time I start an index, I get the following:
# doveadm index -q -u fts-test@example.com \*
doveadm(fts-test@example.com): Info: FTS Xapian: Starting with partial=3 full=20 attachments=0 verbose=1
doveadm(fts-test@example.com): Fatal: write(indexer) failed: Resource temporarily unavailable
but indexer-worker does run in the background. It does this now on every mailbox. Is this really an error? Either way, it needs addressing.
Cool. I moved the allocation of Xapian docs to the heap.
For the error you mention, can you dump your doveconf -n again?
# doveconf -n
# 2.3.4.1 (f79e8e7e4): /etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.4 ()
# OS: Linux 4.19.0-10-amd64 x86_64 Debian 10.6
# Hostname: hostname.example.com
auth_master_user_separator = *
auth_mechanisms = plain login
first_valid_gid = 2000
first_valid_uid = 2000
mail_gid = vmail
mail_location = maildir:%h
mail_plugins = " fts fts_xapian zlib"
mail_privileged_group = mail
mail_uid = vmail
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext imapsieve vnd.dovecot.imapsieve
namespace inbox {
inbox = yes
location =
mailbox Archive {
auto = create
special_use = \Archive
}
mailbox Archives {
auto = no
special_use = \Archive
}
mailbox Bin {
auto = no
special_use = \Trash
}
mailbox Deleted {
auto = no
special_use = \Trash
}
mailbox Drafts {
auto = subscribe
special_use = \Drafts
}
mailbox Junk {
auto = create
special_use = \Junk
}
mailbox Sent {
auto = subscribe
special_use = \Sent
}
mailbox "Sent Messages" {
auto = no
special_use = \Sent
}
mailbox Spam {
auto = no
special_use = \Junk
}
mailbox Trash {
auto = subscribe
special_use = \Trash
}
prefix =
}
passdb {
args = /etc/dovecot/conf.d/dovecot-sql-master.conf.ext
driver = sql
master = yes
result_success = continue
}
passdb {
args = /etc/dovecot/dovecot-sql.conf.ext
driver = sql
}
plugin {
fts = xapian
fts_autoindex = yes
fts_autoindex_exclude = \Junk
fts_autoindex_exclude2 = \Trash
fts_enforced = yes
fts_filters = lowercase english-possessive
fts_languages = en
fts_xapian = partial=3 full=20 attachments=0 verbose=1
imapsieve_mailbox1_before = file:/var/mail/vmail/sieve/global/report-spam.sieve
imapsieve_mailbox1_causes = COPY
imapsieve_mailbox1_name = Junk
imapsieve_mailbox2_before = file:/var/mail/vmail/sieve/global/report-ham.sieve
imapsieve_mailbox2_causes = COPY
imapsieve_mailbox2_from = Junk
imapsieve_mailbox2_name = *
plugin = fts fts_xapian
recipient_delimiter = +
sieve = file:/var/vmail/sieve/%d/%n/scripts;active=/var/vmail/sieve/%d/%n/active-script.sieve
sieve_before = /var/vmail/sieve/global/spam-global.sieve
sieve_global = /var/vmail/sieve/global
sieve_global_extensions = +vnd.dovecot.pipe
sieve_pipe_bin_dir = /usr/bin
sieve_plugins = sieve_imapsieve sieve_extprograms
zlib_save = gz
zlib_save_level = 6
}
protocols = " imap lmtp sieve pop3 sieve"
service auth-worker {
user = nobody
}
service auth {
unix_listener /var/spool/postfix/private/auth {
group = postfix
mode = 0660
user = postfix
}
}
service indexer-worker {
vsz_limit = 1 G
}
service managesieve-login {
inet_listener sieve {
port = 4190
}
}
service managesieve {
process_limit = 1024
}
service stats {
unix_listener stats-writer {
group = vmail
mode = 0660
user = vmail
}
}
ssl = required
ssl_cert = </etc/letsencrypt/live/example.com/fullchain.pem
ssl_dh = # hidden, use -P to show it
ssl_key = # hidden, use -P to show it
userdb {
args = /etc/dovecot/dovecot-sql.conf.ext
driver = sql
}
protocol lmtp {
mail_plugins = " fts fts_xapian zlib sieve"
}
protocol lda {
mail_plugins = " fts fts_xapian zlib sieve"
}
protocol imap {
mail_plugins = " fts fts_xapian zlib imap_sieve"
}
Shall we close this issue and open another one regarding this "Resource temporarily unavailable" issue ?
I am closing this bug as it is solved now. I created a new issue for the "Resource temporarily unavailable" error: https://github.com/grosjo/fts-xapian/issues/62
Sorry for the late response - I'm dealing with 2 other software releases at the moment.
I'm not so sure the issue is fixed - that was just the one account. I started a sequential reindex of all the remaining mail accounts, and there were 5 more core dumps. I did see one indexer-worker's memory usage exceed 4GB before it core dumped. In another, the user had already been indexed and then received email; the indexer-worker appeared to reindex (not sure why) and then core dumped. This is the backtrace:
(gdb) bt
#0 0x00007f9a7e3df7bb in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f9a7e3ca535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f9a7bf77983 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007f9a7bf7d8c6 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007f9a7bf7d901 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007f9a7bf7db89 in __cxa_rethrow () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007f9a7b89b334 in GlassWritableDatabase::replace_document (this=0x564a61f564c0, did=<optimized out>, document=...) at backends/glass/glass_database.cc:1379
#7 0x00007f9a7bd2274f in fts_backend_xapian_index_text (backend=0x564a61f899e0, uid=<optimized out>, field=<optimized out>, data=<optimized out>) at fts-backend-xapian-functions.cpp:1028
#8 0x00007f9a7bd22fe8 in fts_backend_xapian_update_build_more (_ctx=0x564a62058e50,
data=0x564a6908be70 "\n> nZ6eCtcfDC8SIORIxRKFVAn+v7xfQxIvG13QABucV2guMpcXFZdayirMlvLKqpqyquqKmtra\n> qjowF9V7y2vr98Ksaagoqmr8Oqx44t0GL6YYzelkuQuHCpeupL8ETqMzm7SkMdVeYNmQAErq\n> zNNMfUCvy62prKvcuxc+XC0cFnvqG2qr/4Dj0qO2f9BQBOv"..., size=<optimized out>) at fts-backend-xapian.cpp:518
#9 0x00007f9a7e36ac10 in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#10 0x00007f9a7e36b4bd in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#11 0x00007f9a7e36bbbd in fts_build_mail () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#12 0x00007f9a7e3718b0 in ?? () from /usr/lib/dovecot/modules/lib20_fts_plugin.so
#13 0x00007f9a7e773344 in mail_precache () from /usr/lib/dovecot/libdovecot-storage.so.0
#14 0x0000564a605e1c66 in ?? ()
#15 0x00007f9a7e688adf in io_loop_call_io () from /usr/lib/dovecot/libdovecot.so.0
#16 0x00007f9a7e68a0d6 in io_loop_handler_run_internal () from /usr/lib/dovecot/libdovecot.so.0
#17 0x00007f9a7e688b7c in io_loop_handler_run () from /usr/lib/dovecot/libdovecot.so.0
#18 0x00007f9a7e688ce0 in io_loop_run () from /usr/lib/dovecot/libdovecot.so.0
#19 0x00007f9a7e6090d3 in master_service_run () from /usr/lib/dovecot/libdovecot.so.0
#20 0x0000564a605e16aa in main ()
(gdb)
The log:
Nov 2 11:23:49 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Nov 2 11:23:49 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: Query= uid:"1009"
Nov 2 11:23:53 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 61673 items, max length=64, (total 676 KB)
Nov 2 11:23:56 fawkes postfix/smtps/smtpd[12309]: warning: unknown[212.70.149.69]: SASL LOGIN authentication failed: UGFzc3dvcmQ6
Nov 2 11:23:57 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: Query= uid:"1009"
Nov 2 11:23:57 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 948 items, max length=62, (total 11 KB)
Nov 2 11:24:00 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: Query= uid:"1009"
Nov 2 11:24:00 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Nov 2 11:24:00 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: Query= uid:"1009"
Nov 2 11:24:00 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Nov 2 11:24:00 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: Query= uid:"1009"
Nov 2 11:24:00 fawkes postfix/smtps/smtpd[12309]: lost connection after AUTH from unknown[212.70.149.69]
Nov 2 11:24:00 fawkes postfix/smtps/smtpd[12309]: disconnect from unknown[212.70.149.69] ehlo=1 auth=0/1 rset=1 commands=2/3
Nov 2 11:24:01 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 20742 items, max length=62, (total 224 KB)
Nov 2 11:24:04 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: Query= uid:"1009"
Nov 2 11:24:04 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Nov 2 11:24:04 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: Query= uid:"1009"
Nov 2 11:24:08 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 59007 items, max length=64, (total 628 KB)
Nov 2 11:24:10 fawkes dovecot: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
Nov 2 11:24:10 fawkes dovecot: indexer-worker: Error: what(): std::bad_alloc
Nov 2 11:24:11 fawkes dovecot: indexer: Error: Indexer worker disconnected, discarding 2 requests for aaaaa@example.com
Nov 2 11:24:13 fawkes dovecot: indexer-worker(aaaaa@example.com)<12410><sVs3GxyzxtMqAoAQZFQAACnixEoCihl6:2GqlJubon196MAAA+4JX8g>: Fatal: master: service(indexer-worker): child 12410 killed with signal 6 (core dumped)
It seems that you are not using the latest git, as the "replace_document" call is on line 1053, not 1028.
Please try the latest changes, so we can see a bit more detail on that error (there is now a try/catch around the culprit).
Started a reindex of all mail accounts from current HEAD (7130db3), with vsz_limit = 4 GB and partial=4.
I reduced "vsz_limit" to 200M on my own production server,
and I get proper handling of the low memory:
Nov 3 11:30:07 gjserver dovecot[910516]: indexer-worker(admin@grosjo.net)<911132><IdNiODw/oV/v5g0A0thIag:FuxdJz0/oV8c5w0A0thIag>: FTS Xapian: Low memory
Nov 3 11:30:09 gjserver dovecot[910516]: indexer-worker(admin@grosjo.net)<911132><IdNiODw/oV/v5g0A0thIag:FuxdJz0/oV8c5w0A0thIag>: FTS Xapian: Committed 'Low memory indexing' in 1224 ms
I've never seen those warnings. With the default and even at 1GB, I get the core dumps I've been posting, which is why I have set it at 4GB temporarily. It's now at 3.1GB usage and still indexing the first account, so worryingly high but no core yet, maillog 2.5GB.
Unfortunately I cannot spend too much more time on this as I have other priorities ATM. Pity, because it is so much faster...
With the latest git, you should not get core dumps, as the memory error is now caught inside a try/catch.
Let me know when you have more time
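To illustrate the try/catch point: the earlier backtrace ends in std::terminate() and abort(), which is what happens when std::bad_alloc leaves the call stack without being caught. Below is a minimal sketch of the general pattern, not the plugin's actual code. The names StepResult, run_guarded and simulated_low_memory_step are hypothetical, and the failing step simply throws, since a real allocation failure depends on the process limits.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <new>

// Hypothetical outcome type for one indexing step (not a name from fts-xapian).
enum class StepResult { Ok, OutOfMemory };

// Run one step and turn std::bad_alloc into a return value.
// Without a handler like this, the exception escapes, std::terminate()
// runs, and the process aborts with SIGABRT (signal 6).
StepResult run_guarded(const std::function<void()>& step) {
    assert(step && "step must be callable");
    try {
        step();
        return StepResult::Ok;
    } catch (const std::bad_alloc& e) {
        // Log and let the caller decide what to do next, for example
        // commit pending changes to free memory, then retry or skip.
        std::fprintf(stderr, "Memory error '%s'\n", e.what());
        return StepResult::OutOfMemory;
    }
}

// Stand-in for a step that runs out of memory (assumption: thrown
// directly here instead of exhausting real memory).
void simulated_low_memory_step() {
    throw std::bad_alloc();
}
```

With this shape, a failed step is reported to the caller as StepResult::OutOfMemory and the worker keeps running, instead of dumping core.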
@ams001 perhaps a stupid question, but could some limit other than vsz_limit be in play there? E.g. systemd, OS...
@slavkoja No, I see the memory usage grow to whatever I have vsz_limit set to and then the process disappears. I'm now logging times to see if that is the cause of the core, although @grosjo mentions I won't be seeing a core anymore.
@grosjo While the core is no more, I'm still concerned about excessive memory usage, as indexer-worker's memory usage has grown to 3.7GB (4GB limit) and now I start to see the std::bad_alloc errors (log below). FYI, the size of this first mailbox is 5.7GB, yet the xapian index directory is 4.7GB. In a previous run, the index grew to 11.2GB!!!
...
Nov 3 16:00:33 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Query= uid:"656331"
Nov 3 16:00:33 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 790 items, max length=44, (total 9 KB)
Nov 3 16:00:52 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Query= uid:"656331"
Nov 3 16:00:53 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 18819 items, max length=63, (total 214 KB)
Nov 3 16:01:12 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Query= uid:"656331"
Nov 3 16:01:12 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Nov 3 16:01:12 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Query= uid:"656331"
Nov 3 16:01:13 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 35735 items, max length=63, (total 412 KB)
Nov 3 16:01:32 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Query= uid:"656331"
Nov 3 16:01:32 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 0 items, max length=0, (total 0 KB)
Nov 3 16:01:32 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Query= uid:"656331"
Nov 3 16:01:32 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: NGRAM(body,XBDY) -> 20413 items, max length=61, (total 232 KB)
Nov 3 16:01:43 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: Error: FTS Xapian: Memory error 'std::bad_alloc'
Nov 3 16:01:44 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Refreshing after 289883 ms (vs 300000) and 19 updates (vs 1000000) and 3294 KB ...
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Committed 'refreshing' in 1112 ms
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: Error: Mailbox FolderA: Mail search failed: Internal error occurred. Refer to server log for more information. [2020-11-03 08:45:59]
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: fts_backend_xapian_update_set_mailbox
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Opening DB (RW) /var/vmail/example.com/aaaaa/xapian-indexes/db_7f5af7ba291b2df1a11d573bdb55d7e9
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Opening DB (RW) /var/vmail/example.com/aaaaa/xapian-indexes/db_7f5af7ba291b2df1a11d573bdb55d7e9 : Done
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Do expunge 'FolderA'
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Query= expungeheader:"1"
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Expunging 'FolderA' : 0 to do, doing 0
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Committed 'Expunging 'FolderA' done' in 1 ms
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Done indexing 'FolderA' (7f5af7ba291b2df1a11d573bdb55d7e9) (24004 msgs in 26145749 ms, rate: 0.9)
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Committed 'unset_box' in 1 ms
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Box is empty
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: fts_backend_xapian_update_deinit (/var/vmail/example.com/aaaaa/xapian-indexes)
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Committed 'update_deinit' in 0 ms
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: Error: Mailbox FolderA: Transaction commit failed: FTS transaction commit failed: transaction context (attempted to index 24004 messages (UIDs 625515..656331))
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><aI4JLccYoV8ZPQAA+4JX8g>: FTS Xapian: Deinit /var/vmail/example.com/aaaaa/xapian-indexes
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: fts_backend_xapian_init
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: Starting with partial=4 full=20 attachments=0 verbose=1
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: fts_backend_xapian_get_last_uid
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: Error: FTS Xapian: Can not open RO index (FolderT) /var/vmail/example.com/aaaaa/xapian-indexes/db_29721fb8b6c84024872dd3620d3a2ec9 : DatabaseOpeningError - No such file or directory
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: GetLastUID: Can not open db RO (/var/vmail/example.com/aaaaa/xapian-indexes/db_29721fb8b6c84024872dd3620d3a2ec9)
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: fts_backend_xapian_get_last_uid
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: Error: FTS Xapian: Can not open RO index (FolderT) /var/vmail/example.com/aaaaa/xapian-indexes/db_29721fb8b6c84024872dd3620d3a2ec9 : DatabaseOpeningError - No such file or directory
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: GetLastUID: Can not open db RO (/var/vmail/example.com/aaaaa/xapian-indexes/db_29721fb8b6c84024872dd3620d3a2ec9)
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: fts_backend_update_context
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: fts_backend_xapian_update_set_mailbox
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: Start indexing 'FolderT' (29721fb8b6c84024872dd3620d3a2ec9)
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: Opening DB (RW) /var/vmail/example.com/aaaaa/xapian-indexes/db_29721fb8b6c84024872dd3620d3a2ec9
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: Opening DB (RW) /var/vmail/example.com/aaaaa/xapian-indexes/db_29721fb8b6c84024872dd3620d3a2ec9 : Done
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: Query= uid:"9"
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: Ngram(XMID) -> 6 items (total 0 KB)
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: Query= uid:"9"
Nov 3 16:01:45 fawkes dovecot: indexer-worker(aaaaa@example.com)<15641><6IogK+l+oV8ZPQAA+4JX8g>: FTS Xapian: Ngram(A) -> 117 items (total 0 KB)
...
I've just switched back to solr and thought I'd give my 2p.
While fts-xapian was faster on fetching results, fts-solr is significantly faster on doing a full index. With an average account size of 3.4GB, fts-solr took just over 11 minutes avg per user to do a full index, with a resultant index size of 400MB avg. By contrast, fts-xapian took almost 5 hours avg per user, with a resultant average index size almost equal to the account size. This is a massive difference.
I am using xapian with mailman3 and have been impressed with its speed, which is why I thought I'd use fts-xapian - it's certainly easier to configure. Could something be wrong with my configuration (see comment above)?
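To put the figures above side by side, here is a small arithmetic sketch. It assumes rounded inputs taken from the comment ("just over 11 minutes" as 660 s, "almost 5 hours" as 18000 s, 3.4GB as 3400 MB), so the results are ballpark only. The function names are hypothetical.

```cpp
#include <cstdio>

// Average indexing throughput in MB per second for one full index run.
double throughput_mb_per_s(double account_mb, double seconds) {
    return account_mb / seconds;
}

// How many times longer run B took compared with run A.
double slowdown(double seconds_a, double seconds_b) {
    return seconds_b / seconds_a;
}

// Print both throughputs and the ratio for the rounded figures
// quoted in the comment (assumed values, not measurements).
void print_comparison() {
    const double account_mb = 3400.0;        // 3.4GB average account
    const double solr_s = 11.0 * 60.0;       // about 11 minutes
    const double xapian_s = 5.0 * 3600.0;    // about 5 hours
    std::printf("solr:   %.2f MB/s\n", throughput_mb_per_s(account_mb, solr_s));
    std::printf("xapian: %.2f MB/s\n", throughput_mb_per_s(account_mb, xapian_s));
    std::printf("ratio:  %.1fx\n", slowdown(solr_s, xapian_s));
}
```

On these rounded inputs the full index with fts-xapian takes roughly 27 times as long as with fts-solr.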
Yes, it is true that initial indexing takes time and memory (see the README).
However, once the first index is built, it is very fast for daily use by your users, and very easy to maintain.
I still see some errors in the log above. I will try to dig into that. If you are still interested, we can continue testing this low-memory scenario.
Note: on my production server, I set vsz_limit to 0 (disabled), and it works like a charm (on a 64GB server).
Hi guys, fts-xapian is awesome, so much easier to set up. It's working with all the accounts except one: mine. I've got 50,000+ emails in the "All Mail" folder, and it's causing a crash. Below is from the mail.log. I tried to find the core dump (it isn't in /var/crash). Any help would be much appreciated. Even if you could just help me find the problem email, I'm very happy to delete it.
Cheers,
Rich