Open Philippe23 opened 1 month ago
Unfortunately, I didn't have debug symbols being installed, so I just have the callstacks. I think I've installed debug symbols for the program at this point, so if/when it happens again, I may have some more info.
Callstack:
Core was generated by `/usr/libexec/dovecot/indexer-worker'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007ffb1aa8735c in ?? () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7ffb17e006c0 (LWP 12050))]
(gdb) bt
#0 0x00007ffb1aa8735c in ?? () from /lib64/libc.so.6
#1 0x00007ffb1aa371f6 in raise () from /lib64/libc.so.6
#2 0x00007ffb1aa1f8f7 in abort () from /lib64/libc.so.6
#3 0x00007ffb1a243f22 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#4 0x00007ffb1a25697c in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#5 0x00007ffb1a2569e7 in std::terminate() () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#6 0x00007ffb1a256c47 in __cxa_throw () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#7 0x00007ffb1a2439cc in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#8 0x00007ffb180ff985 in ?? () from /usr/lib64/libxapian.so.30
#9 0x00007ffb180fddf8 in Xapian::Document::Internal::add_term(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int) () from /usr/lib64/libxapian.so.30
#10 0x00007ffb18318f75 in XDoc::create_document (this=this@entry=0x555c4fdf9010, verbose=0,
title=0x555c4fc2fb60 "DW #267 (Holding.Order Information,/home/USER1/.maildir/xapian-indexes/db_23392717807b795b6d200000d09efc50) - ") at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/basic_string.tcc:242
#11 0x00007ffb1832cdc9 in XDocsWriter::worker (this=0x555c4fc3d130)
at /usr/src/debug/net-mail/dovecot-fts-xapian-1.7.14/fts-xapian-1.7.14/src/fts-backend-xapian-functions.cpp:843
#12 0x00007ffb1a2876a3 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#13 0x00007ffb1aa85652 in ?? () from /lib64/libc.so.6
#14 0x00007ffb1ab018ac in ?? () from /lib64/libc.so.6
#11 0x00007ffb1832cdc9 in XDocsWriter::worker (this=0x555c4fc3d130)
at /usr/src/debug/net-mail/dovecot-fts-xapian-1.7.14/fts-xapian-1.7.14/src/fts-backend-xapian-functions.cpp:843
843 doc->create_document(verbose,title);
(gdb) info locals
start_time = 1722164636509
n = 3
err = false
i = 2
newdoc = 0
doc = 0x555c4fdf9010
(gdb) print *this
$9 = {dbpath = 0x555c4fbd5a50 "/home/USER1/.maildir/xapian-indexes/db_23392717807b795b6d200000d09efc50",
docs = 0x555c4fc1ed00, m = 0x555c4fbd0690, terminated = false, dbw = 0x555c4fb43280, verbose = 0,
batch = 0x555c4fbd06c8, totaldocs = 0x555c4fbd06c0, t = 0x555c4fc1de40,
title = 0x555c4fc2fb60 "DW #267 (Holding.Order Information,/home/USER1/.maildir/xapian-indexes/db_23392717807b795b6d200000d09efc50) - ", pos = 11}
(gdb) print *doc
$1 = {data = 0x7ffb17382010, strings = 0x555c4fdce990, headers = 0x555c4fb2a250, uid = 11964, size = 15,
stems = 50001, tsize = 9878, uterm = 0x555c4fbc1bb0 "Q11964", xdoc = 0x7ffadae9c7b0}
strings
& headers
are both empty std::vector
s.
(gdb) list
838 if(verbose>0) syslog(LOG_INFO,"%sProcessing #%ld (%ld/%ld) %s",title,doc->uid,i+1,n,doc->getSummary().c_str());
839 pos=10;
840 doc->populate_stems(verbose,title);
841 if(verbose>0) syslog(LOG_INFO,"%sCreating doc #%ld (%ld/%ld) %s",title,doc->uid,i+1,n,doc->getSummary().c_str());
842 pos=11;
843 doc->create_document(verbose,title);
844 if(verbose>0) syslog(LOG_INFO,"%sPushing Doc %ld (%ld/%ld) with %ld strings and %ld stems",title,doc->uid,i+1,n,doc->size,doc->stems);
845 if(doc->stems > 0)
846 {
847 pos=12;
Logs:
Jul 28 06:48:27 myhostname dovecot[2176]: indexer-worker: Error: cannot allocate memory for thread-local data: ABORT
Jul 28 06:48:27 myhostname dovecot[2176]: indexer-worker(USER1)<8623><84I/VkwedsMmTRZ9:INRmDpohpmavIQAA0J78UA>: Fatal: master: service(indexer-worker): child 8623 returned error 127
Jul 28 06:48:30 myhostname dovecot[2176]: indexer-worker(USER1)<8886><84I/VkwedsMmTRZ9:aD3YLPshpma2IgAA0J78UA>: FTS Xapian: Waiting for all pending documents to be processed (Sleep5)
Jul 28 07:05:11 myhostname dovecot[2176]: indexer-worker(USER1)<11426><84I/VkwedsMmTRZ9:mFZODjolpmaiLAAA0J78UA>: Fatal: master: service(indexer-worker): child 11426 killed with signal 6 (core dumped)
Footnote: There seem to be a few different PIDs. Not sure if that's expected, or if I'm grabbing a few issues, but only one's creating a core. (?)
I'm realizing I don't have Xapian's debug symbols either. Gonna' rebuild that as well with detached debug symbols.
I should point out that I've got 11.5 gigs of swap free.
And I'm linking against Xapian v1.4.25.
Core was generated by `/usr/libexec/dovecot/indexer-worker'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f3c0af2c35c in ?? () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7f3c0785b6c0 (LWP 27298))]
(gdb) bt
#0 0x00007f3c0af2c35c in ?? () from /lib64/libc.so.6
#1 0x00007f3c0aedc1f6 in raise () from /lib64/libc.so.6
#2 0x00007f3c0aec48f7 in abort () from /lib64/libc.so.6
#3 0x00007f3c0a6e8f22 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#4 0x00007f3c0a6fb97c in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#5 0x00007f3c0a6fb9e7 in std::terminate() () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#6 0x00007f3c0a6fbc47 in __cxa_throw () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#7 0x00007f3c0a6e89cc in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#8 0x00007f3c085a49a5 in std::__new_allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > >::allocate (this=<optimized out>, __n=1)
at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/new_allocator.h:151
#9 std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > > >::allocate (__n=1, __a=...)
at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/alloc_traits.h:482
#10 std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > >::_M_get_node (this=0x7f3bd7bd8880) at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/stl_tree.h:563
#11 std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > >::_M_create_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OmDocumentTerm> > (this=0x7f3bd7bd8880) at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/stl_tree.h:613
#12 std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > >::_Auto_node::_Auto_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OmDocumentTerm> > (__t=..., this=<synthetic pointer>)
at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/stl_tree.h:1637
#13 std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > >::_M_emplace_hint_unique<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OmDocumentTerm> > (this=this@entry=0x7f3bd7bd8880, __pos=__pos@entry={...})
at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/stl_tree.h:2462
#14 0x00007f3c085a2e18 in std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OmDocumentTerm, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > >::emplace_hint<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OmDocumentTerm> > (__pos=..., this=0x7f3bd7bd8880)
at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/stl_map.h:640
#15 std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OmDocumentTerm, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > >::insert<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, OmDocumentTerm> > (__x=..., this=0x7f3bd7bd8880)
at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/stl_map.h:860
#16 Xapian::Document::Internal::add_term (this=0x7f3bd7bd8810, tname=..., wdfinc=<optimized out>) at api/omdocument.cc:502
#17 0x00007f3c085a2f3b in Xapian::Document::add_term (this=<optimized out>, tname=..., wdfinc=<optimized out>) at api/omdocument.cc:146
#18 0x00007f3c087bdf75 in XDoc::create_document (this=this@entry=0x5650a507bec0, verbose=0, title=0x5650a500c330 "DW #253 (Holding.Order Information,/home/USER1/.maildir/xapian-indexes/db_23392717807b795b6d200000d09efc50) - ")
at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/basic_string.tcc:242
#19 0x00007f3c087d1dc9 in XDocsWriter::worker (this=0x5650a51c1b30) at /usr/src/debug/net-mail/dovecot-fts-xapian-1.7.14/fts-xapian-1.7.14/src/fts-backend-xapian-functions.cpp:843
#20 0x00007f3c0a72c6a3 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#21 0x00007f3c0af2a652 in ?? () from /lib64/libc.so.6
#22 0x00007f3c0afa68ac in ?? () from /lib64/libc.so.6
(gdb) up
#16 Xapian::Document::Internal::add_term (this=0x7f3bd7bd8810, tname=..., wdfinc=<optimized out>) at api/omdocument.cc:502
502 terms.insert(make_pair(tname, std::move(newterm)));
(gdb) list
497 map<string, OmDocumentTerm>::iterator i;
498 i = terms.find(tname);
499 if (i == terms.end()) {
500 ++termlist_size;
501 OmDocumentTerm newterm(wdfinc);
502 terms.insert(make_pair(tname, std::move(newterm)));
503 } else {
504 if (i->second.increase_wdf(wdfinc))
505 ++termlist_size;
506 }
(gdb) print terms
$4 = std::map with 40230 elements = {["Q11923"] **snip**
#8 0x00007f3c085a49a5 in std::__new_allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, OmDocumentTerm> > >::allocate (this=<optimized out>, __n=1)
at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/new_allocator.h:151
151 return static_cast<_Tp*>(_GLIBCXX_OPERATOR_NEW(__n * sizeof(_Tp)));
(gdb) print __n
$12 = 1
(gdb) print sizeof(_Tp)
$13 = 96
So it's got a good size document going (40k of terms), but the allocation isn't huge, just 96 bytes.
Weird. For your last comment : 96 bytes for 40k elements ? Meaning most, if not all, are empty ?
96 bytes for 40k elements ? Meaning most, if not all, are empty ?
No, the 96 bytes appears to be just a node in a data structure (red-black tree). So it's a straw-that-broke-the-camel's-back situation.
My config has vsz_limit = 1G
, so I'm guessing that dovecot proper enforces this memory cap, and we're hitting that cap. (Also, this is why having all that free swap at a system-level doesn't matter.)
Since I don't have any 1G messages, perhaps there's a memory leak?
I'll also try raising vsz_limit
a little and see if it keeps happening.
But feel free to suggest other things to try to help track down what's going on.
Also, I'm not sure if there's a way for you to catch the allocation exception and recover from it.
1G for 40k terms, this seems to much, as each term can not be more than than 245 chars
I found a memory leaj. Can you try latest git ?
Upgraded to latest, and returned to the 1G limit I had before. I'll keep an eye on it for a few days and let you know if I see it again or not.
Not sure if it's related or a new crash, but I just got this twice:
Logs:
Aug 6 15:58:18 myhostname xapian-docswriter[23673]: DW #264 (Archived Trash.2024,/home/USER1/.maildir/xapian-indexes/db_c3bf2b05772d8066180f0000d09efc50) - Can't add document2 : std::exception
Aug 6 15:58:18 myhostname xapian-docswriter[23673]: DW #264 (Archived Trash.2024,/home/USER1/.maildir/xapian-indexes/db_c3bf2b05772d8066180f0000d09efc50) - Retrying (/home/USER1/.maildir/xapian-indexes/db_c3bf2b05772d8066180f0000d09efc50) from std::exception
Aug 6 15:58:18 myhostname kernel: indexer-worker[23957]: segfault at 9 ip 00007fb8036b3d28 sp 00007fb8028f8ae0 error 4 in libxapian.so.30.12.6[7fb8035a4000+1ba000] likely on CPU 1 (core 0, socket 0)
Aug 6 15:58:18 myhostname kernel: Code: 5c e9 bc 12 ef ff 0f 1f 40 00 5b 5d 41 5c c3 90 66 90 f3 0f 1e fa 41 57 49 89 ff 41 56 41 55 41 54 49 89 f4 55 53 48 83 ec 18 <0f> b7 6f 09 48 89 4c 24 08 66 c1 c5 08 0f b7 ed 83 fa ff 0f 84 ff
Aug 6 15:58:59 myhostname dovecot[10243]: indexer-worker(USER1)<23673><yjsoLwkfA8EmTRZ9:a6d9Hw2AsmZ5XAAA0J78UA>: Fatal: master: service(indexer-worker): child 23673 killed with signal 11 (core dumped)
Backtrace:
Core was generated by `/usr/libexec/dovecot/indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fb8036b3d28 in do_bswap (value=<optimized out>) at ./common/wordaccess.h:47
47 return __builtin_bswap16(value);
[Current thread is 1 (Thread 0x7fb8028f96c0 (LWP 23957))]
(gdb) bt
#0 0x00007fb8036b3d28 in do_bswap (value=<optimized out>) at ./common/wordaccess.h:47
#1 do_unaligned_read<unsigned short> (ptr=0x9 <error: Cannot access memory at address 0x9>) at ./common/wordaccess.h:122
#2 unaligned_read2 (ptr=0x9 <error: Cannot access memory at address 0x9>) at ./common/wordaccess.h:165
#3 Glass::DIR_END (b=0x0) at backends/glass/glass_table.h:114
#4 GlassTable::find_in_leaf (p=0x0, item=..., c=11, exact=exact@entry=@0x7fb8028f8b37: false) at backends/glass/glass_table.cc:428
#5 0x00007fb8036b5d24 in GlassTable::find (this=this@entry=0x5645fa45c530, C_=C_@entry=0x5645fa45c5f8) at backends/glass/glass_table.h:175
#6 0x00007fb8036b5d85 in GlassTable::key_exists (this=this@entry=0x5645fa45c530, key=<incomplete sequence \340>) at backends/glass/glass_table.cc:1505
#7 0x00007fb8036ab274 in GlassPostListTable::merge_doclen_changes (this=this@entry=0x5645fa45c530, doclens=std::map with 766 elements = {...}) at backends/glass/glass_postlist.cc:1149
#8 0x00007fb8036a2470 in Inverter::flush_doclengths (table=..., this=0x5645fa45d130) at backends/glass/glass_inverter.cc:158
#9 Inverter::flush (this=this@entry=0x5645fa45d130, table=...) at backends/glass/glass_inverter.cc:218
#10 0x00007fb803692f9f in GlassWritableDatabase::flush_postlist_changes (this=0x5645fa45c160) at backends/glass/glass_database.cc:1064
#11 0x00007fb803693ba5 in GlassWritableDatabase::commit (this=0x5645fa45c160) at backends/glass/glass_database.cc:1043
#12 0x00007fb8035ebf81 in Xapian::WritableDatabase::commit (this=0x5645fa3e00c0) at api/omdatabase.cc:864
#13 0x00007fb80382005a in XDocsWriter::worker (this=0x5645fa513450) at /usr/src/debug/net-mail/dovecot-fts-xapian-9999/dovecot-fts-xapian-9999/src/fts-backend-xapian-functions.cpp:908
#14 0x00007fb80577a6a3 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#15 0x00007fb805f78652 in ?? () from /lib64/libc.so.6
#16 0x00007fb805ff48ac in ?? () from /lib64/libc.so.6
Code for frame #13 + data I can find from the locals:
(gdb) list
903 {
904 pos=16;
905 syslog(LOG_WARNING,"%s Retrying (%s) from %s",title,dbpath,err_s.c_str());
906 try
907 {
908 dbw->commit();
909 pos=17;
910 dbw->replace_document(doc->uterm,*(doc->xdoc));
911 (*totaldocs)++;
912 }
913 catch(Xapian::Error e)
914 {
915 syslog(LOG_ERR,"%sCan't add document3 : %s - %s %s",title,e.get_type(),e.get_msg().c_str(),e.get_error_string());
916 }
917 catch(std::exception e)
918 {
919 syslog(LOG_ERR,"%sCan't add document4 : %s",title,e.what());
920 }
921 pos=18;
922 }
(gdb) info locals
lck = {_M_device = 0x5645fa40f400, _M_owns = true}
err_s = "std::exception"
start_time = 1722974297079
n = 3
err = <optimized out>
i = <optimized out>
newdoc = 2
doc = 0x5645fa5a80f0
(gdb) print *doc
$1 = {data = 0x0, strings = 0x5645fa5e1900, headers = 0x5645fa46b910, uid = 769, size = 17, stems = 28517, tsize = 9819, uterm = 0x5645fa54fdb0 "Q769", xdoc = 0x5645fa51d520}
gdb) print *doc->strings
$2 = std::vector of length 0, capacity 32
(gdb) print *doc->headers
$3 = std::vector of length 0, capacity 32
Let me know if you want me to open a different issue for this one.
Oh, and if it's relevant, I'm linked against Xapian 1.4.25.
This seems to be either another memory leak, or a low memory on your server.
How much memory yo allocated to the indexer ?
Can you provide the few lines before "Can't add document2" ?
please also use latest git
Can you provide the few lines before "Can't add document2" ?
867 #pragma GCC diagnostic push
868 #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
869 while(!(lck.try_lock_for(std::chrono::milliseconds(1000 + std::rand() % 1000))))
870 {
871 if(verbose>1) syslog(LOG_INFO,"%sMutex : Waiting unlock... (thread)",title);
872 }
873 #pragma GCC diagnostic pop
874 if(verbose>1) syslog(LOG_INFO,"%sMutex : Lock acquired (thread)",title);
875 std::string err_s; err_s.clear();
876 try
877 {
878 pos=13;
879 dbw->replace_document(doc->uterm,*(doc->xdoc));
880 (*totaldocs)++;
881 (*batch)++;
882 if((*batch) > XAPIAN_WRITING_CACHE)
883 {
884 syslog(LOG_INFO,"%s Committing %ld docs (vs %ld limit)",title,(*batch),XAPIAN_WRITING_CACHE );
885 pos=15;
886 dbw->commit();
887 *batch = 0;
888 }
889 }
890 catch(Xapian::Error e)
891 { 892 syslog(LOG_WARNING,"%sCan't add document1 : %s - %s %s",title,e.get_type(), e.get_msg().c_str(),e.get_error_string());
893 err=true;
894 err_s.append(e.get_type());
895 }
896 catch(std::exception e)
897 {
898 syslog(LOG_WARNING,"%sCan't add document2 : %s",title,e.what());
899 err=true;
900 err_s.append(e.what());
901 }
902 if(err)
903 {
904 pos=16;
905 syslog(LOG_WARNING,"%s Retrying (%s) from %s",title,dbpath,err_s.c_str());
906 try
907 {
908 dbw->commit();
909 pos=17;
910 dbw->replace_document(doc->uterm,*(doc->xdoc));
911 (*totaldocs)++;
912 }
913 catch(Xapian::Error e)
914 { 915 syslog(LOG_ERR,"%sCan't add document3 : %s - %s %s",title,e.get_type(), e.get_msg().c_str(),e.get_error_string());
916 }
917 catch(std::exception e)
918 {
919 syslog(LOG_ERR,"%sCan't add document4 : %s",title,e.what());
920 }
921 pos=18;
922 }
923 }
924 pos=19;
925 delete(doc);
926 newdoc++;
927 } 928 if(verbose>0) syslog(LOG_INFO,"%sWrote %ld new docs in %ld ms",title,newdoc,fts_backend_xapian_curr ent_time() - start_time);
The additional context you requested before I upgrade to latest.
I meant, can you redo your test with the latest git ?
Any crash ?
I haven't had a crash since August 6th. I've been running the latest since August 13th. (August 13th is also the last time I've restarted my dovecot process.)
Core was generated by `/usr/libexec/dovecot/indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007efe631dcd28 in do_bswap (value=<optimized out>)
at ./common/wordaccess.h:47
47 return __builtin_bswap16(value);
[Current thread is 1 (Thread 0x7efe617116c0 (LWP 5949))]
(gdb) bt
#0 0x00007efe631dcd28 in do_bswap (value=<optimized out>) at ./common/wordaccess.h:47
#1 do_unaligned_read<unsigned short> (ptr=0x9 <error: Cannot access memory at address 0x9>) at ./common/wordaccess.h:122
#2 unaligned_read2 (ptr=0x9 <error: Cannot access memory at address 0x9>) at ./common/wordaccess.h:165
#3 Glass::DIR_END (b=0x0) at backends/glass/glass_table.h:114
#4 GlassTable::find_in_leaf (p=0x0, item=..., c=-1, exact=exact@entry=@0x7efe61710a87: false) at backends/glass/glass_table.cc:428
#5 0x00007efe631ded24 in GlassTable::find (this=0x7efe54046090, C_=0x7efe28bac598) at backends/glass/glass_table.h:175
#6 0x00007efe631b7488 in GlassCursor::find_entry (this=0x7efe2908e230, key="Q4238") at backends/glass/glass_cursor.cc:164
#7 0x00007efe631d0c1d in GlassPostList::init (this=this@entry=0x7efe5c1777a0) at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/unique_ptr.h:199
#8 0x00007efe631d1064 in GlassPostList::GlassPostList (this=this@entry=0x7efe5c1777a0, this_db_=..., term_="Q4238", keep_reference=keep_reference@entry=true) at backends/glass/glass_postlist.cc:708
#9 0x00007efe631ba827 in GlassWritableDatabase::open_post_list (this=0x7efe54045cc0, tname="Q4238") at ./include/xapian/intrusive_ptr.h:155
#10 0x00007efe6314f7ad in Xapian::Database::Internal::replace_document (this=0x7efe54045cc0, unique_term=..., document=...) at backends/database.cc:197
#11 0x00007efe63348f6a in XDocsWriter::worker (this=0x7efe5c0f0340) at /usr/src/debug/net-mail/dovecot-fts-xapian-9999/dovecot-fts-xapian-9999/src/fts-backend-xapian-functions.cpp:1025
#12 0x00007efe652a36a3 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6
#13 0x00007efe65aa1652 in ?? () from /lib64/libc.so.6
#14 0x00007efe65b1d8ac in ?? () from /lib64/libc.so.6
Aug 24 07:31:55 myhostname dovecot[4322]: indexer-worker(USER7)<5932><Er+FJaXEyWYoFwAA0J78UA:6JAaNaXEyWYsFwAA0J78UA>: FTS Xapian: Waiting for all pending documents (1) to be processed (Sleep5) with 2 threads
Aug 24 07:34:12 myhostname xapian-docswriter[5932]: DW #2 (Sent,/home/USER1/.maildir/xapian-indexes/db_90751f2b357a795b611b0000d09efc50) - Can't write doc2 : std::exception
Aug 24 07:34:12 myhostname xapian-docswriter[5932]: FTS Xapian: Can not allocate ICU translator (2)
Aug 24 07:34:12 myhostname kernel: indexer-worker[5949]: segfault at 9 ip 00007efe631dcd28 sp 00007efe61710a30 error 4 in libxapian.so.30.12.6[7efe630cd000+1ba000] likely on CPU 1 (core 0, socket 0)
Aug 24 07:34:12 myhostname kernel: Code: 5c e9 bc 12 ef ff 0f 1f 40 00 5b 5d 41 5c c3 90 66 90 f3 0f 1e fa 41 57 49 89 ff 41 56 41 55 41 54 49 89 f4 55 53 48 83 ec 18 <0f> b7 6f 09 48 89 4c 24 08 66 c1 c5 08 0f b7 ed 83 fa ff 0f 84 ff
Aug 24 07:34:51 myhostname dovecot[4322]: indexer-worker(USER1)<5932><6Lv6JGwgWs8mTRZ9:0D63G7vEyWYsFwAA0J78UA>: Fatal: master: service(indexer-worker): child 5932 killed with signal 11 (core dumped)
I suspect this one is an upstream bug.
This looks like a memory leak
Could be -- certainly happened when memory was low, but it doesn't look like you're directly causing the crash from what I see. All the pointers you're passing in look like they're non-NULL from what I see. So my gut says that it's upstream not handling an allocation failure properly. I guess, "why not both?" could apply here.
Kindly set verbose=1 and use latest git
Latest has a ton of warnings like this:
fts-backend-xapian-functions.cpp:951:83: warning: format not a string literal and no format arguments [-Wformat-security]
951 | if(verbose>0) { s=title; s.append("No-op"); syslog(LOG_INFO,s.c_str()); }
| ~~~~~~^~~~~~~~~~~~~~~~~~~~
You should do these like this 👇 if you're just trying to output a code-built string:
syslog(LOG_INFO, "%s", s.c_str());
// ^^^^
Otherwise, you could have a %
format character in the string you're trying to output and the library will assume there's an arg passed that contains what it should read (but there isn't).
With the "%s"
format string you're saying, output this string arg as-is.
Made a PR to fix 'em for you: https://github.com/grosjo/fts-xapian/pull/171
I'm seeing some core dumps from 1.7.14.
Logs that reference the PID from the core:
A different trace:
Logs:
Another one:
Logs:
I also saw this once in my logs, but didn't find a core for the specific process: