slusarz / dovecot-fts-flatcurve

Dovecot FTS Flatcurve plugin (Xapian)
https://slusarz.github.io/dovecot-fts-flatcurve/
GNU Lesser General Public License v2.1

Crash when rotating DB #26

Closed internethering closed 2 years ago

internethering commented 2 years ago

Hey, is it normal that the FTS storage needs ~3 times more space than the mail storage?

du -sh /srv/vmail/<domain>/<local>/mdbox/storage
12G     /srv/vmail/<domain>/<local>/mdbox/storage
du -sh /srv/vmail/<domain>/<local>/mdbox/mailboxes/INBOX/dbox-Mails/fts-flatcurve
40G     /srv/vmail/<domain>/<local>/mdbox/mailboxes/INBOX/dbox-Mails/fts-flatcurve

Here is my FTS config:

plugin {
    fts = flatcurve
    fts_flatcurve_substring_search = no
    fts_autoindex = yes
    fts_enforced = yes
    fts_filters = normalizer-icu snowball stopwords
    fts_filters_en = lowercase snowball english-possessive stopwords
    fts_index_timeout = 60s
    fts_languages = en de
    fts_tokenizer_generic = algorithm=simple
    fts_tokenizers = generic email-address
    fts_decoder = decode2text
    fts_autoindex_exclude = \Junk
    fts_autoindex_exclude2 = \Trash
}

For comparison, the old Lucene indexes of this mailbox:

# du -sh /srv/vmail/<domain>/<local>/mdbox/lucene-indexes
796M    /srv/vmail/<domain>/<local>/mdbox/lucene-indexes

slusarz commented 2 years ago

This sounds like a duplicate of #22 (which is a Dovecot issue, not a flatcurve issue).

internethering commented 2 years ago

    ### These are the two lines which cause the issue
    fts_header_excludes = *
    fts_header_includes = From To Cc Bcc Subject Message-ID

I don't have these options enabled, and I tried disabling fts_filters* too. After some days of indexing:

77G /srv/vmail/<domain>/<local>/mdbox/mailboxes/INBOX/dbox-Mails/fts-flatcurve

If it's a Dovecot issue, why is the Lucene index only 796M?

internethering commented 2 years ago

OK, I checked the log files and see hundreds of segfaults; this may be causing the problem.

These are all from the last two hours:

Fri 2022-05-06 16:04:42 CEST 1777919   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 15.2M
Fri 2022-05-06 16:21:39 CEST 1834404   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 16.1M
Fri 2022-05-06 16:29:23 CEST 1834401   0 65502 SIGSEGV present  /usr/libexec/dovecot/indexer-worker 39.9M
Fri 2022-05-06 16:32:13 CEST 1979022   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 15.8M
Fri 2022-05-06 16:36:58 CEST 1834405   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 19.9M
Fri 2022-05-06 16:42:03 CEST 2080176   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 16.6M
Fri 2022-05-06 16:53:26 CEST 2139771   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 15.2M
Fri 2022-05-06 17:04:48 CEST 2169852   0 65502 SIGSEGV present  /usr/libexec/dovecot/indexer-worker 39.5M
Fri 2022-05-06 17:14:55 CEST 2214940   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 15.6M
Fri 2022-05-06 17:19:21 CEST 2214947   0 65502 SIGSEGV present  /usr/libexec/dovecot/indexer-worker 40.6M
Fri 2022-05-06 17:24:59 CEST 2282140   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 18.4M
Fri 2022-05-06 17:27:55 CEST 2308150   0 65502 SIGSEGV present  /usr/libexec/dovecot/indexer-worker 43.2M
Fri 2022-05-06 17:41:06 CEST 2362289   0 65502 SIGABRT present  /usr/libexec/dovecot/indexer-worker 15.7M
Fri 2022-05-06 17:51:35 CEST 2392627   0 65502 SIGSEGV present  /usr/libexec/dovecot/indexer-worker 25.2M
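
To see how the crashes split by signal, the listing above can be tallied with a short pipeline. This is only a sketch: `tally_signals` is a hypothetical helper name, and it assumes the column layout shown above, where the signal name is the eighth whitespace-separated field.

```shell
# Hypothetical helper: count crashes per signal from `coredumpctl list` rows
# read on stdin. Assumes the layout shown above (field 8 = signal name).
tally_signals() {
    awk '$8 ~ /^SIG/ { count[$8]++ }
         END { for (sig in count) print count[sig], sig }' | sort -k2
}

# Example (requires systemd's coredumpctl):
#   coredumpctl list /usr/libexec/dovecot/indexer-worker | tally_signals
```

Applied to the 14 rows pasted above, this reports 9 SIGABRT and 5 SIGSEGV.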

Here is the backtrace:

# coredumpctl debug 1834401
           PID: 1834401 (indexer-worker)
           UID: 0 (root)
           GID: 65502 (vmail)
        Signal: 11 (SEGV)
     Timestamp: Fri 2022-05-06 16:29:22 CEST (1h 34min ago)
  Command Line: dovecot/indexer-worker
    Executable: /usr/libexec/dovecot/indexer-worker
 Control Group: /system.slice/dovecot.service
          Unit: dovecot.service
         Slice: system.slice
       Boot ID: 033175faa3794f36a444334c55b9c051
    Machine ID: 4a6906c2152dff421b9b5a385455c64e
      Hostname: alpha.mail
       Storage: /var/lib/systemd/coredump/core.indexer-worker.0.033175faa3794f36a444334c55b9c051.1834401.1651847362000000.zst (present)
     Disk Size: 39.9M
       Message: Process 1834401 (indexer-worker) of user 0 dumped core.

                Module linux-vdso.so.1 with build-id 937693c867f0e9c46128a13ae184019fba3161d3
                Module ISO8859-1.so without build-id.
                Module libuuid.so.1 without build-id.
                Module librt.so.1 without build-id.
                Module libxapian.so.30 without build-id.
                Module lib21_fts_flatcurve_plugin.so without build-id.
                Module liblz4.so.1 without build-id.
                Module liblzma.so.5 without build-id.
                Module libbz2.so.1 without build-id.
                Module libz.so.1 without build-id.
                Module lib20_zlib_plugin.so without build-id.
                Module lib20_virtual_plugin.so without build-id.
                Module lib20_replication_plugin.so without build-id.
                Module libdl.so.2 without build-id.
                Module libpthread.so.0 without build-id.
                Module libicudata.so.70 without build-id.
                Module libgcc_s.so.1 without build-id.
                Module libm.so.6 without build-id.
                Module libstdc++.so.6 without build-id.
                Module libicuuc.so.70 without build-id.
                Module libicui18n.so.70 without build-id.
                Module libexttextcat-2.0.so.0 without build-id.
                Module libstemmer.so.2 without build-id.
                Module lib20_fts_plugin.so without build-id.
                Module lib15_notify_plugin.so without build-id.
                Module lib11_trash_plugin.so without build-id.
                Module lib10_quota_plugin.so without build-id.
                Module lib01_acl_plugin.so without build-id.
                Module libc.so.6 without build-id.
                Module libdovecot.so.0 without build-id.
                Module libdovecot-storage.so.0 without build-id.
                Module indexer-worker without build-id.
                Stack trace of thread 1834401:
                #0  0x00007f6eac755956 fts_flatcurve_xapian_db_add (lib21_fts_flatcurve_plugin.so + 0xd956)
                #1  0x00007f6eac7547a5 fts_flatcurve_xapian_create_current (lib21_fts_flatcurve_plugin.so + 0xc7a5)
                #2  0x00007f6eac754a6b fts_flatcurve_xapian_close_db (lib21_fts_flatcurve_plugin.so + 0xca6b)
                #3  0x00007f6eac756c53 fts_flatcurve_xapian_init_msg (lib21_fts_flatcurve_plugin.so + 0xec53)
                #4  0x00007f6eac7517a3 fts_backend_flatcurve_update_set_build_key (lib21_fts_flatcurve_plugin.so + 0x97a3)
                #5  0x00007f6eaed1f0c1 fts_backend_update_set_build_key (lib20_fts_plugin.so + 0xc0c1)
                #6  0x00007f6eaed20723 fts_build_mail_header (lib20_fts_plugin.so + 0xd723)
                #7  0x00007f6eaed26716 fts_mail_index (lib20_fts_plugin.so + 0x13716)
                #8  0x00007f6eaf1b1c00 mail_precache (libdovecot-storage.so.0 + 0x53c00)
                #9  0x000055f3c11f2bba index_mailbox_precache (indexer-worker + 0x2bba)
                #10 0x00007f6eaf087743 connection_input_default (libdovecot.so.0 + 0xf7743)
                #11 0x00007f6eaf0a5ac8 io_loop_call_io (libdovecot.so.0 + 0x115ac8)
                #12 0x00007f6eaf0a71d2 io_loop_handler_run_internal (libdovecot.so.0 + 0x1171d2)
                #13 0x00007f6eaf0a5b71 io_loop_handler_run (libdovecot.so.0 + 0x115b71)
                #14 0x00007f6eaf0a5d30 io_loop_run (libdovecot.so.0 + 0x115d30)
                #15 0x00007f6eaf019ae3 master_service_run (libdovecot.so.0 + 0x89ae3)
                #16 0x000055f3c11f25ed main (indexer-worker + 0x25ed)
                #17 0x00007f6eaedbc2fa __libc_start_call_main (libc.so.6 + 0x292fa)
                #18 0x00007f6eaedbc3a8 __libc_start_main_impl (libc.so.6 + 0x293a8)
                #19 0x000055f3c11f26a1 _start (indexer-worker + 0x26a1)
                ELF object binary architecture: AMD x86-64

GNU gdb (Gentoo 12.1 vanilla) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/libexec/dovecot/indexer-worker...
Reading symbols from /usr/lib/debug//usr/libexec/dovecot/indexer-worker.debug...
[New LWP 1834401]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `dovecot/indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f6eac755956 in fts_flatcurve_xapian_db_add (backend=0x55f3c3118a10, dbpath=<optimized out>, type=<optimized out>, open_wdb=<optimized out>) at fts-backend-flatcurve-xapian.cpp:493
493                     hash_table_insert(x->dbs, newpath->fname, o);
(gdb) bt
#0  0x00007f6eac755956 in fts_flatcurve_xapian_db_add (backend=0x55f3c3118a10, dbpath=<optimized out>, type=<optimized out>, open_wdb=<optimized out>) at fts-backend-flatcurve-xapian.cpp:493
#1  0x00007f6eac7547a5 in fts_flatcurve_xapian_create_current (backend=0x55f3c3118a10, copts=0) at /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/basic_string.h:194
#2  0x00007f6eac754a6b in fts_flatcurve_xapian_close_db (backend=0x55f3c3118a10, xdb=0x55f3c339d4b8, opts=FLATCURVE_XAPIAN_DB_CLOSE_ROTATE) at fts-backend-flatcurve-xapian.cpp:1043
#3  0x00007f6eac756c53 in fts_flatcurve_xapian_init_msg (ctx=ctx@entry=0x55f3c2f4b528) at fts-backend-flatcurve-xapian.cpp:1199
#4  0x00007f6eac7517a3 in fts_backend_flatcurve_update_set_build_key (_ctx=0x55f3c2f4b528, key=0x7ffefd6135c0) at fts-backend-flatcurve.c:252
#5  0x00007f6eaed1f0c1 in fts_backend_update_set_build_key (ctx=0x55f3c2f4b528, key=key@entry=0x7ffefd6135c0) at fts-api.c:198
#6  0x00007f6eaed20723 in fts_build_mail_header (block=0x7ffefd613580, block=0x7ffefd613580, ctx=0x7ffefd6135f0) at fts-build-mail.c:160
#7  fts_build_mail_real (may_need_retry_r=0x7ffefd61351f, retriable_err_msg_r=0x7ffefd613530, mail=0x55f3c9b04418, update_ctx=0x55f3c2f4b528) at fts-build-mail.c:654
#8  fts_build_mail (update_ctx=0x55f3c2f4b528, mail=mail@entry=0x55f3c9b04418) at fts-build-mail.c:704
#9  0x00007f6eaed26716 in fts_mail_index (_mail=0x55f3c9b04418) at fts-storage.c:547
#10 fts_mail_precache (_mail=0x55f3c9b04418) at fts-storage.c:570
#11 0x00007f6eaf1b1c00 in mail_precache (mail=0x55f3c9b04418) at mail.c:519
#12 0x000055f3c11f2bba in index_mailbox_precache (box=0x55f3c2eea968, conn=0x55f3c2ec53e0) at master-connection.c:119
#13 index_mailbox (user=<optimized out>, user=<optimized out>, what=<optimized out>, max_recent_msgs=<optimized out>, mailbox=<optimized out>, conn=0x55f3c2ec53e0) at master-connection.c:238
#14 master_connection_input_args (_conn=0x55f3c2ec53e0, args=<optimized out>) at master-connection.c:284
#15 0x00007f6eaf087743 in connection_input_default (conn=0x55f3c2ec53e0) at connection.c:95
#16 0x00007f6eaf0a5ac8 in io_loop_call_io (io=0x55f3c3125320) at ioloop.c:737
#17 0x00007f6eaf0a71d2 in io_loop_handler_run_internal (ioloop=ioloop@entry=0x55f3c2e86ea0) at ioloop-epoll.c:222
#18 0x00007f6eaf0a5b71 in io_loop_handler_run (ioloop=0x55f3c2e86ea0) at ioloop.c:789
#19 0x00007f6eaf0a5d30 in io_loop_run (ioloop=0x55f3c2e86ea0) at ioloop.c:762
#20 0x00007f6eaf019ae3 in master_service_run (service=0x55f3c2e86d00, callback=callback@entry=0x55f3c11f2770 <client_connected>) at master-service.c:863
#21 0x000055f3c11f25ed in main (argc=<optimized out>, argv=<optimized out>) at indexer-worker.c:76

This is the production system and I had to disable flatcurve completely, so I can't do much testing on my side.

slusarz commented 2 years ago

Storage usage is not the issue here; it is a symptom of the crashes, which leave the DB in a damaged state so that later processes can't reuse the data that already exists.

Have you tried "doveadm fts-flatcurve remove " to remove the entire FTS index for the mailbox, and then doing a manual "doveadm index"?
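
The remove-then-reindex sequence suggested above might look like the following sketch. The `reindex_mailbox` wrapper and the `user@example.com` / `INBOX` values are placeholders, and the `-u <user> <mailbox>` argument form is assumed from doveadm's usual mail-command conventions, so check the usage output of your doveadm version before running it.

```shell
# Hypothetical wrapper: drop the flatcurve index for one mailbox, then rebuild it.
# Assumes doveadm's usual "-u <user> <mailbox>" argument form.
reindex_mailbox() {
    user=$1
    mailbox=$2
    # Remove the existing (possibly damaged) flatcurve data for this mailbox.
    doveadm fts-flatcurve remove -u "$user" "$mailbox" || return 1
    # Rebuild the index by hand instead of waiting for fts_autoindex.
    doveadm index -u "$user" "$mailbox"
}

# Example with placeholder values:
#   reindex_mailbox user@example.com INBOX
```

Stopping after a failed remove avoids indexing on top of data that could not be cleared.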

I would need more information about the variables passed to the hash_table_insert() call. It seems all of those variables exist, since they are used in various calls right before line 493. Printing the variables in gdb could help track this down.
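
One way to collect those values is to run gdb non-interactively against the stored core. This is a sketch under assumptions: `print_crash_vars` is a hypothetical helper name, the variable names (`x`, `newpath`, `o`) are taken from source line 493 as shown in the backtrace, and the executable path is the one coredumpctl reported.

```shell
# Hypothetical helper: print the arguments of the crashing hash_table_insert()
# call (line 493 in the backtrace) from a stored core dump.
print_crash_vars() {
    pid=$1
    core=$(mktemp) || return 1
    # Extract the core file that systemd-coredump stored for this PID.
    coredumpctl --output="$core" dump "$pid" || return 1
    # Frame 0 is fts_flatcurve_xapian_db_add(); some values may be reported
    # as <optimized out> in an optimized build.
    gdb -batch \
        -ex 'frame 0' \
        -ex 'info locals' \
        -ex 'print x' \
        -ex 'print x->dbs' \
        -ex 'print newpath' \
        -ex 'print newpath->fname' \
        -ex 'print o' \
        /usr/libexec/dovecot/indexer-worker "$core"
    rm -f "$core"
}

# Example, using the PID from the backtrace above:
#   print_crash_vars 1834401
```

If most values come back as `<optimized out>`, a plugin build with debug symbols and lower optimization would likely be needed to get useful output.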

slusarz commented 2 years ago

Closing due to no feedback and inability to reproduce in testing.