grosjo / fts-xapian

Dovecot FTS plugin based on Xapian
GNU Lesser General Public License v2.1

Error: cannot allocate memory for thread-local data: ABORT #107

Closed ghost closed 2 years ago

ghost commented 2 years ago

Hello,

I upgraded some servers to fts-xapian version 1.5.1 and I'm still getting memory errors:

Nov 15 10:43:07 server dovecot[8940]: indexer-worker: Error: cannot allocate memory for thread-local data: ABORT
Nov 15 10:43:07 server dovecot[8940]: indexer: Error: Indexer worker disconnected, discarding 1 requests for email@domain.tld
Nov 15 10:43:07 server dovecot[8940]: indexer-worker(email@domain.tld)<8974><azCfCbo5kmEMIwAAD78hag:azCfCbo5kmEMIwAAD78hag>: Fatal: master: service(indexer-worker): child 8974 returned error 127

This happens randomly in most emails, except one in particular where it happens all the time.

Any suggestions please?

grosjo commented 2 years ago

Can you set "verbose=1" and send the log from just before the crash?

ghost commented 2 years ago

Today I noticed even more errors:

indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
indexer-worker: Error:   what():  std::bad_alloc
indexer: Error: Indexer worker disconnected, discarding 1 requests for user@domain.tld

Where would I put "verbose=1" please?

grosjo commented 2 years ago

In the dovecot.conf file, inside the plugin section:

fts_xapian = partial=X full=Y verbose=1

grosjo commented 2 years ago

@Bill-ColeUS Have you been able to get the log ?

ghost commented 2 years ago

Here you go:

xapiandebug.zip

grosjo commented 2 years ago

Can you kindly try with the latest git version and set verbose=2?

ghost commented 2 years ago

Unfortunately I can't compile and run things on the live server, I can only get updates via EPEL rpm packages.

I noticed that the dovecot.index.cache file is 141678872 bytes (about 141 MB), and the server has 4 GB of RAM.

grosjo commented 2 years ago

Can you share the output of doveconf -n?

grosjo commented 2 years ago

And can you still try with "verbose=2" on the version you are using?

grosjo commented 2 years ago

@Bill-ColeUS Any news on that? Thank you so much.

ghost commented 2 years ago

Here we go, the doveconf -n output:

doveconfn.txt

ghost commented 2 years ago

Dovecot output with verbose=2:

xapiandebug.txt

grosjo commented 2 years ago

Thanks @Bill-ColeUS

In the conf, can you add:

service indexer-worker {
  vsz_limit = 0
}

Also, the log file does not show a crash message. Can you try to capture the log from a few dozen lines before a crash?
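
One possible way to capture that context, offered as a minimal sketch and not as anything shipped with Dovecot or fts-xapian: keep a rolling window of recent log lines and emit it whenever a crash message appears. The patterns are copied from the errors quoted earlier in this thread; the function name `context_before_crashes` and the default of 40 lines are assumptions made for illustration.

```python
import re
from collections import deque

# Crash signatures taken from the error messages quoted in this thread.
CRASH_PATTERN = re.compile(
    r"cannot allocate memory"
    r"|std::bad_alloc"
    r"|Fatal: master: service\(indexer-worker\)"
)


def context_before_crashes(lines, before=40, pattern=CRASH_PATTERN):
    """Yield (context, crash_line) pairs.

    `context` is a list of up to `before` lines that preceded `crash_line`.
    `lines` can be any iterable of strings, such as an open log file.
    """
    window = deque(maxlen=before)  # rolling buffer of the most recent lines
    for raw in lines:
        line = raw.rstrip("\n")
        if pattern.search(line):
            yield list(window), line
        # The crash line itself joins the window, so consecutive crash
        # lines show up in each other's context.
        window.append(line)
```

Iterating over `context_before_crashes(open(path, errors="replace"))` and printing each context followed by its crash line would give the excerpt requested above; `path` stands for wherever syslog writes the Dovecot log on the server, which this thread does not state.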

ghost commented 2 years ago

I have added verbose=2 and vsz_limit=0. I will now monitor the server and report anything unusual.

We may need to wait for a few days to catch a crash, hopefully it won't take long.

ghost commented 2 years ago

No fatal errors so far, but I do see tons of these:

Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: Warning: FTS Xapian: Free memory 1089 MB vs 200 MB minimum
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: fts_backend_xapian_index_text
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: fts_backend_xapian_query
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: NGRAM(body,XBDY) -> 1140 items, max length=32, (total 13 KB)
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: Indexing part as text
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: fts_backend_xapian_update_unset_build_key with 21821 docs in the index
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: fts_backend_xapian_update_set_build_key
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: New part (Header=Content-Type,Type=(null),Disposition=(null))
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: Unknown header 'contenttype' of part
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: fts_backend_xapian_update_set_build_key
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: New part (Header=Content-Transfer-Encoding,Type=(null),Disposition=(null))
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: Unknown header 'contenttransferencoding' of part
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: fts_backend_xapian_update_set_build_key
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: New part (Header=Content-Disposition,Type=(null),Disposition=(null))
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: Unknown header 'contentdisposition' of part
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: fts_backend_xapian_update_set_build_key
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: New part (Header=(null),Type=text/html,Disposition=attachment; filename="ORDER149993.html")
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: Found part as attachment of type 'text/html' and disposition 'attachment; filename="ORDER149993.html"'
Nov 22 08:25:21 server dovecot[26346]: indexer-worker(hello@example.tld)<30976><Jm07HdE3m2H+eAAAD78hgA:mpywINE3m2EAeQAAD78hgA>: FTS Xapian: Indexing part as attachment (data like '
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:   � �
Nov 22 08:25:21 server dovecot[26346]: indexer-worker: Error:   #015')

There are lots of blank Error: lines and some that contain binary data.
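
To make a log like this easier to read, the empty and binary `Error:` lines could be filtered out before review. The following is a heuristic sketch only: it assumes the `indexer-worker: Error:` prefix shown in the excerpt above and treats `#NNN` sequences such as `#015` as syslog escapes for control characters. The names `is_noise` and `drop_noise` are made up for this example.

```python
import re

# Prefix of the bare error lines in the excerpt; captures whatever follows it.
ERROR_LINE = re.compile(r"indexer-worker: Error:(.*)$")
# Assumption: "#" plus three digits is an escaped control character (e.g. #015).
SYSLOG_ESCAPE = re.compile(r"#\d{3}")


def is_noise(line):
    """Return True for 'Error:' lines with no ASCII letters or digits in the message."""
    match = ERROR_LINE.search(line)
    if match is None:
        return False  # not a bare indexer-worker error line, keep it
    message = SYSLOG_ESCAPE.sub("", match.group(1)).strip()
    return not any(ch.isascii() and ch.isalnum() for ch in message)


def drop_noise(lines):
    """Return the lines with empty or binary-only 'Error:' lines removed."""
    return [line for line in lines if not is_noise(line)]
```

Real errors such as `cannot allocate memory for thread-local data: ABORT` contain ordinary text and are kept, so the filter should not hide a crash message.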

grosjo commented 2 years ago

These are just display bugs; I fixed them in the latest git.

Any crash?

ghost commented 2 years ago

No crash so far...

Maybe the vsz_limit = 0 option fixed the issue?

grosjo commented 2 years ago

And what if you set vsz_limit = 350M?

grosjo commented 2 years ago

Also, I pushed v1.5.2 to epel8. Kindly use the update.

grosjo commented 2 years ago

https://koji.fedoraproject.org/koji/packageinfo?packageID=34417

ghost commented 2 years ago

Unfortunately I can't do any more experiments on this live server. Since it's working fine now, I'll leave it alone :)

Thank you for your help!