manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0

Manticoresearch crashes with memory issues #2525

Open RyDoRe opened 3 weeks ago

RyDoRe commented 3 weeks ago

Hi there!

I already tried my luck in the forum and got redirected to create a Ticket here: https://forum.manticoresearch.com/t/malloc-issues-when-indexing-large-plain-tables/2069/7

Setup description:

I have a plain table setup using the main+delta scheme. I run Manticore in Docker containers; my source is an external MariaDB, and the Manticore container(s) run in a k8s cluster.

I have Manticore set up as a StatefulSet in the Kubernetes cluster and use local-path storage for the Manticore data.

My Dockerfile for Manticore uses a custom startup CMD script that checks whether data is already present and, if not, runs a full indexing before starting the searchd process with the following command: 'searchd -c /etc/manticoresearch/manticore.conf --nodetach --logdebugv'

I have two indexer scripts: one that indexes everything from scratch and one that runs only the delta indexes. In both scripts I call a PHP script that dynamically rebuilds the manticore.conf file before executing the indexers, to ensure all sources are available.
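To illustrate the idea, here is a minimal stand-in for that config-generation step, written in shell instead of PHP. The IDs, paths, and the query are hypothetical placeholders, not my actual script:

```shell
#!/bin/bash
# Minimal stand-in for the PHP config generator: emit one "source" block
# per database ID. IDs, paths and the query are hypothetical placeholders.
OUT=/tmp/manticore.conf.generated

for id in 00001 00002 00003; do
  cat <<EOF
source db_${id} : db {
    sql_db = database_${id}
}
source main_table1_${id} : db_${id} {
    sql_query = SELECT 1  # real query elided
}
EOF
done > "$OUT"

# quick sanity check: one main source block per ID
grep -c '^source main_table1_' "$OUT"
```

The real script additionally iterates over the database rows to discover the IDs, but the shape of the generated config is the same.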

My manticore.conf looks like this. I included one index (of several) as an example to give an idea of the setup:

indexer
{
    mem_limit        = 1024M
    lemmatizer_cache = 256M
    write_buffer     = 256M
}
common
{
    lemmatizer_base = /var/lib/dict
}
searchd
{
    listen              = 0.0.0.0:9313:sphinx
    listen              = 0.0.0.0:9306:mysql
    listen              = 0.0.0.0:9308:http
    log                 = /var/lib/manticore/searchd.log
    network_timeout     = 5
    pid_file            = /var/run/manticore/searchd.pid
    seamless_rotate     = 0
    preopen_tables      = 1
    secondary_indexes   = 1
    unlink_old          = 1
    binlog_path         = # disable logging
}
source db {
        type = mysql
        sql_host = <?php echo $mysqlhost . "\n"?>
        sql_user = <?php echo $mysqluser . "\n"?>
        sql_pass = <?php echo $mysqlpwd . "\n"?>
        sql_port = <?php echo $mysqlport . "\n"?>
        sql_query_pre = SET NAMES utf8
        sql_query_pre = SET SESSION query_cache_type=OFF
}

source db_<?php echo $row->ID?> : db {
        sql_db = database_<?php echo sprintf("%1$05d", $row->ID)?>

}

source main_table1_<?=$row->ID?> : db_<?=$row->ID?>
{
        sql_query_pre =         SET NAMES utf8
        sql_query_pre =         REPLACE INTO table.ManticoreTimeStamp SELECT <?=$row->ID?>,12, UNIX_TIMESTAMP()
        sql_query =               SELECT <QUERY_TO_SELECT>
                                          WHERE t.lastChange <= (SELECT lastChange FROM table.ManticoreTimeStamp WHERE counterID = 12 AND 
                                          ID_IN_TABLE =<?=$row->ID?> ) 
        sql_attr_uint =         table_id
        sql_attr_timestamp =    createTime
        sql_attr_timestamp =    lastAction
        sql_attr_uint =         ID_IN_TABLE
}

source delta_table1_<?=$row->ID?> : main_table1_<?=$row->ID?>
{
        sql_query_pre =         SET NAMES utf8
        sql_query =                SELECT <QUERY_TO_SELECT>
                                           WHERE t.createTime > (SELECT lastChange FROM table.ManticoreTimeStamp WHERE counterID = 12 AND 
                                           ID_IN_TABLE =<?=$row->ID?> )

}

index main_table1
{
        ngram_len = 1
        charset_table = non_cjk
        ngram_chars = cjk
        min_word_len = 3
        min_infix_len = 3
        html_strip = 1
        html_remove_elements = style, script # should strip the content of these tags
        morphology = lemmatize_de_all, lemmatize_en_all
<?php
foreach ($source_element as $row) {
    echo "source = main_table1_$row->ID\n";
}
?>
        path = /var/lib/manticore/main_table1
}

index delta_table1: main_table1
{
<?php
foreach ($source_element as $row) {
    echo "source = delta_table1_$row->ID\n";
}
?>
        path = /var/lib/manticore/delta_table1
        index_exact_words = 1
}

Bug description:

This setup worked fine with small amounts of data, but with large datasets it crashes reliably.

The initial indexing works without issues. It takes about two hours to run through, and afterwards the searchd process gets started. After that, triggering any delta index crashes searchd and the whole container. I have the feeling that the rotate step somehow doesn't run through reliably.

One of the last times I tried to execute the indexing I got more or less the following output:

DEBUG: will rotate delta_table1
DEBUG: will rotate delta_table2
DEBUG: will rotate delta_table3
DEBUG: will rotate delta_table4
DEBUG: will rotate delta_table5
DEBUG: will rotate main_table6
1900K … … … … … 100% 4.34M=0.2s
double free or corruption (out)
DEBUG: TaskRotation starts with 6 deferred tables
DEBUG: seamless rotate local table delta_table1
rotating table 'delta_table1': started
Crash!!! Handling signal 6
DEBUG: prealloc enough RAM and lock new table
DEBUG: Locking the table via file /var/lib/manticore/delta_table1.new.spl
DEBUG: lock /var/lib/manticore/delta_table1.new.spl success
DEBUG: CSphIndex_VLN::Preread invoked 'delta_table1'(/var/lib/manticore/delta_table1.new)
DEBUG: Preread successfully finished
DEBUG: activate new table
RW-idx for rename to .old, acquiring…
RW-idx for rename to .old, acquired…
DEBUG: rotating table 'delta_table1': applying other tables killlists
DEBUG: rotating table 'delta_table1': applying other tables killlists… DONE
DEBUG: rotating table 'delta_table1': apply killlist from this table to other tables (killlist_target)
DEBUG: rotating table 'delta_table1': apply killlist from this table to other tables (killlist_target)… DONE
DEBUG: all went fine; swap them
DEBUG: unlink /var/lib/manticore/delta_table1.old
DEBUG: Unlocking the table (lock /var/lib/manticore/delta_table1.old.spl)
DEBUG: File ID ok, closing lock FD 98, unlinking /var/lib/manticore/delta_table1.old.spl
rotating table 'delta_table1': success
DEBUG: seamless rotate local table delta_table2
rotating table 'delta_table2': started
DEBUG: prealloc enough RAM and lock new table
Crash!!! Handling signal 11

I also got a crash report with the following data:

------- FATAL: CRASH DUMP -------
[Thu Aug 22 11:37:50.425 2024] [   42]

--- crashed SphinxQL request dump ---
P �� )�
--- request dump end ---
--- local index:

Manticore 6.3.0 1811a9efb@24052209 (columnar 2.3.0 88a01c3@24052206) (secondary 2.3.0 88a01c3@24052206) (knn 2.3.0 88a01c3@24052206)
Handling signal 6
-------------- backtrace begins here ---------------
Program compiled with Clang 16.0.6
Configured with flags: Configured with these definitions: -DDISTR_BUILD=jammy -DUSE_SYSLOG=1 -DWITH_GALERA=1 -DWITH_RE2=1 -DWITH_RE2_FORCE_STATIC=1 -DWITH_STEMMER=1 -DWITH_STEMMER_FORCE_STATIC=1 -DWITH_NLJSON=1 -DWITH_UNIALGO=1 -DWITH_ICU=1 -DWITH_ICU_FORCE_STATIC=1 -DWITH_SSL=1 -DWITH_ZLIB=1 -DWITH_ZSTD=1 -DDL_ZSTD=1 -DZSTD_LIB=libzstd.so.1 -DWITH_CURL=1 -DDL_CURL=1 -DCURL_LIB=libcurl.so.4 -DWITH_ODBC=1 -DDL_ODBC=1 -DODBC_LIB=libodbc.so.2 -DWITH_EXPAT=1 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DWITH_ICONV=1 -DWITH_MYSQL=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.21 -DWITH_POSTGRESQL=1 -DDL_POSTGRESQL=1 -DPOSTGRESQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/lib/manticore -DFULL_SHARE_DIR=/usr/share/manticore
Built on Linux x86_64 (jammy) (cross-compiled)
Stack bottom = 0x7f8c22018dc7, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x20000)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x20000, stack=0x7f8c22020000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
searchd(_Z12sphBacktraceib+0x227)[0x55bde8cab057]
searchd(_ZN11CrashLogger11HandleCrashEi+0x364)[0x55bde8b21224]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f8c22060520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f8c220b49fc]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f8c22060476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f8c220467f3]
/lib/x86_64-linux-gnu/libc.so.6(+0x89676)[0x7f8c220a7676]
/lib/x86_64-linux-gnu/libc.so.6(+0xa0cfc)[0x7f8c220becfc]
/lib/x86_64-linux-gnu/libc.so.6(+0xa2e70)[0x7f8c220c0e70]
/lib/x86_64-linux-gnu/libc.so.6(free+0x73)[0x7f8c220c3453]
searchd(_ZN7Threads4Coro8Worker_cD2Ev+0x53)[0x55bde9cc5b93]
searchd(_ZN7Threads4Coro8Worker_c6ResumeEv+0x8f)[0x55bde9cc5a1f]
searchd(_ZN7Threads4Coro8Worker_c10DoCompleteEPvPNS_7details20SchedulerOperation_tE+0x1e)[0x55bde9cc574e]
searchd(_ZN7Threads9Service_t10do_run_oneER14CSphScopedLockI9CSphMutexERNS_23TaskServiceThreadInfo_tERSt6atomicIbE+0x178)[0x55bde8ea9078]
searchd(_ZN7Threads9Service_t3runERSt6atomicIbE+0xc1)[0x55bde8ea8e11]
searchd(_ZN7Threads12ThreadPool_c4loopEi+0x77)[0x55bde8ea8c87]
searchd(+0x12196e7)[0x55bde8ea76e7]
searchd(_Z20ThreadProcWrapper_fnPv+0x3e)[0x55bde8ea675e]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7f8c220b2ac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126850)[0x7f8c22144850]
Trying boost backtrace:
0# sphBacktrace(int, bool) in searchd
1# CrashLogger::HandleCrash(int) in searchd
2# 0x00007F8C22060520 in /lib/x86_64-linux-gnu/libc.so.6
3# pthread_kill in /lib/x86_64-linux-gnu/libc.so.6
4# raise in /lib/x86_64-linux-gnu/libc.so.6
5# abort in /lib/x86_64-linux-gnu/libc.so.6
6# 0x00007F8C220A7676 in /lib/x86_64-linux-gnu/libc.so.6
7# 0x00007F8C220BECFC in /lib/x86_64-linux-gnu/libc.so.6
8# 0x00007F8C220C0E70 in /lib/x86_64-linux-gnu/libc.so.6
9# free in /lib/x86_64-linux-gnu/libc.so.6
10# Threads::Coro::Worker_c::~Worker_c() in searchd
11# Threads::Coro::Worker_c::Resume() in searchd
12# Threads::Coro::Worker_c::DoComplete(void*, Threads::details::SchedulerOperation_t*) in searchd
13# Threads::Service_t::do_run_one(CSphScopedLock<CSphMutex>&, Threads::TaskServiceThreadInfo_t&, std::atomic<bool>&) in searchd
14# Threads::Service_t::run(std::atomic<bool>&) in searchd
15# Threads::ThreadPool_c::loop(int) in searchd
16# 0x000055BDE8EA76E7 in searchd
17# ThreadProcWrapper_fn(void*) in searchd
18# 0x00007F8C220B2AC3 in /lib/x86_64-linux-gnu/libc.so.6
19# 0x00007F8C22144850 in /lib/x86_64-linux-gnu/libc.so.6

-------------- backtrace ends here ---------------

I assume the issue is my setup and not a bug, but I'm at my wits' end and don't know what else to check.

Manticore Search Version:

6.3.0

tomatolog commented 3 weeks ago

what is your OS \ container?

RyDoRe commented 3 weeks ago

The container runs in a k3s cluster with debian11/12

tomatolog commented 3 weeks ago

could you provide the container along with data that reproduces this crash locally?

RyDoRe commented 3 weeks ago

Unfortunately I'm not able to provide you with that, as this crash only happens with the large dataset we have in production, and I'm neither allowed nor able to forward this kind of data :/

I have a smaller review-setup, also as part of a kubernetes cluster, but that one works. It has but a fraction of the data we use in production.

Do you see any obvious misconfiguration in the variables I use in the manticore.conf file?

sanikolaev commented 2 weeks ago

We discussed this issue on our development call today, and it's not clear how to proceed without the data necessary to reproduce it. Unfortunately, the crash backtrace has not been helpful in this case.

> I'm not allowed and able to forward this kind of data :/

@RyDoRe Just FYI, the core team also offers professional services, which includes signing a work agreement with an NDA, etc. If resolving this issue is mission-critical for you, please consider this option.

RyDoRe commented 2 weeks ago

I'm going to talk with my team about it and will inform you on how we want to proceed. Thank you for looking into it!

RyDoRe commented 2 weeks ago

For the sake of completeness, here is the explicit Dockerfile and startup-script:

Dockerfile:

FROM manticoresearch/manticore:6.3.0
ENV TZ=Europe/Berlin
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN apt-get update && apt-get upgrade -y && apt-get install -y  build-essential libmariadbd-dev make autoconf git wget php-cli php-mysqli iputils-ping 

RUN apt-get update && apt install python3-pip -y && pip3 install sphinxapi-py3

USER manticore

COPY scripts/createConfig.php /var/lib/scripts/createConfig.php
COPY dict /var/lib/dict
COPY scripts/delta_indexing.sh /var/lib/scripts/delta_indexing.sh
COPY scripts/full_indexing.sh /var/lib/scripts/full_indexing.sh
COPY scripts/health-check.py /tmp/health/health-check.py

ADD start.sh /

CMD ["/start.sh"]

start.sh:

#!/bin/bash

php /var/lib/scripts/createConfig.php

if [ ! -f /var/lib/manticore/main_table1.spa ]; then
    /var/lib/scripts/full_indexing.sh
fi

searchd -c /etc/manticoresearch/manticore.conf --nodetach --logdebugv

RyDoRe commented 2 weeks ago

I think we were able to narrow down the issue quite a bit...

The issue we are having is related to the number of sources we use per index. We have 6000+ sources that we are trying to address with a single index, and at that scale we get the issues mentioned above. Limiting the number of sources to e.g. 200 works perfectly fine, and with that, rotating after indexing no longer kills the searchd process.

I think our way forward is to shard our indexes to cap the number of sources per index and then add more indexes.
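Concretely, the idea is to split the generated source list across several plain indexes instead of one. A sketch of what that could look like (shard names, source ranges, and paths are hypothetical):

```ini
# before: one index referencing all 6000+ sources
# after: several shards, each capped at a fixed number of sources
index main_table1_shard1
{
    source = main_table1_00001
    # ... up to the shard's source cap
    path   = /var/lib/manticore/main_table1_shard1
}
index main_table1_shard2
{
    source = main_table1_00201
    # ... next batch of sources
    path   = /var/lib/manticore/main_table1_shard2
}
```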

tomatolog commented 2 weeks ago

It seems strange that your data indexes fine and indextool shows no errors afterwards, but the daemon crashes on loading that index. I'm sure that if the indexer finishes without errors and indextool reports no issues, the daemon should load the index fine too.

tomatolog commented 2 weeks ago

do your data sources share the same schema?

do you use joins or any other kind of data that depends on each other?

RyDoRe commented 2 weeks ago

The sources' data schemas are identical. We use joins with sql_joined_field in some of the indexes' queries, but not to tables that are used in other indexes or that depend on each other.

I'm currently trying out sharding the indexes with a maximum of 1000 sources each, and also running the indexer for each shard separately to see whether one specific shard fails.

RyDoRe commented 1 week ago

Just a heads-up on my results so far: I created a setup in which I shard my indexes so that each uses a maximum number of sources, which I currently set to 1000. On top of that, I use a distributed table to combine the shards for searching. From what I can see, this setup now works reliably: both the main indexers and the delta indexers run through without problems.
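For reference, the distributed-table glue looks roughly like this (a sketch with hypothetical shard names, not my exact config):

```ini
index table1_dist
{
    type  = distributed
    # each shard is a regular main+delta pair of plain tables
    local = main_table1_shard1
    local = delta_table1_shard1
    local = main_table1_shard2
    local = delta_table1_shard2
    # ... one pair per shard
}
```

Searches go against table1_dist, so the sharding is transparent to the application.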

An interesting observation: we currently have 5 shards and a little under 5000 sources. Before the sharding, i.e. with around that number of sources in a single index, memory usage went up to around 28-32 GiB. With the sharding, the memory usage of the indexing process now only goes up to about 12 GiB.