manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0
9.02k stars 506 forks

Indexer crash #2460

Closed KarPaLex98 closed 2 months ago

KarPaLex98 commented 3 months ago

Bug Description:

container_name  | Manticore 6.3.2 c296dc7c8@24062606
container_name  | Copyright (c) 2001-2016, Andrew Aksyonoff
container_name  | Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
container_name  | Copyright (c) 2017-2024, Manticore Software LTD (https://manticoresearch.com)
container_name  |
container_name  | using config file '/etc/manticoresearch/manticore.conf'...
container_name  | indexing table 'products'...
container_name  | *** Oops, indexer crashed! Please send the following report to developers.
container_name  | Manticore 6.3.2 c296dc7c8@24062606
container_name  | -------------- report begins here ---------------
container_name  | Current document: docid=0, hits=0
container_name  | Current batch: minid=0, maxid=0
container_name  | Hit pool start: docid=0, hit=0
container_name  | -------------- backtrace begins here ---------------
container_name  | Program compiled with Clang 16.0.6
container_name  | Configured with flags: Configured with these definitions: -DDISTR_BUILD=jammy -DUSE_SYSLOG=1 -DWITH_GALERA=1 -DWITH_RE2=1 -DWITH_RE2_FORCE_STATIC=1 -DWITH_STEMMER=1 -DWITH_STEMMER_FORCE_STATIC=1 -DWITH_NLJSON=1 -DWITH_UNIALGO=1 -DWITH_ICU=1 -DWITH_ICU_FORCE_STATIC=1 -DWITH_SSL=1 -DWITH_ZLIB=1 -DWITH_ZSTD=1 -DDL_ZSTD=1 -DZSTD_LIB=libzstd.so.1 -DWITH_CURL=1 -DDL_CURL=1 -DCURL_LIB=libcurl.so.4 -DWITH_ODBC=1 -DDL_ODBC=1 -DODBC_LIB=libodbc.so.2 -DWITH_EXPAT=1 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DWITH_ICONV=1 -DWITH_MYSQL=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.21 -DWITH_POSTGRESQL=1 -DDL_POSTGRESQL=1 -DPOSTGRESQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/lib/manticore -DFULL_SHARE_DIR=/usr/share/manticore
container_name  | Built on Linux x86_64 (jammy) (cross-compiled)
container_name  | Stack bottom = 0x0, thread stack size = 0x20000
container_name  | Trying system backtrace:
container_name  | begin of system symbols:
container_name  | indexer(_Z12sphBacktraceib+0x227)[0x55a746f09257]
container_name  | indexer(_Z7sigsegvi+0xbb)[0x55a746dfca8b]
container_name  | /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f1a4236d520]
container_name  | indexer(+0xcbc489)[0x55a746e1e489]
container_name  | indexer(_ZN13CSphIndex_VLN5BuildERKN3sph8Vector_TIP10CSphSourceNS0_13DefaultCopy_TIS3_EENS0_14DefaultRelimitENS0_16DefaultStorage_TIS3_EEEEiiR17CSphIndexProgress+0xff2)[0x55a746e1b512]
container_name  | indexer(_Z7DoIndexRK17CSphConfigSectionPKcRK15CSphOrderedHashIS_10CSphString15CSphStrHashFuncLi256EEP8_IO_FILE+0x1d40)[0x55a746dfa880]
container_name  | indexer(main+0x2a1a)[0x55a746dffb9a]
container_name  | /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f1a42354d90]
container_name  | /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f1a42354e40]
container_name  | indexer(_start+0x25)[0x55a746df1935]
container_name  | Trying boost backtrace:
container_name  |  0# sphBacktrace(int, bool) in indexer
container_name  |  1# sigsegv(int) in indexer
container_name  |  2# 0x00007F1A4236D520 in /lib/x86_64-linux-gnu/libc.so.6
container_name  |  3# 0x000055A746E1E489 in indexer
container_name  |  4# CSphIndex_VLN::Build(sph::Vector_T<CSphSource*, sph::DefaultCopy_T<CSphSource*>, sph::DefaultRelimit, sph::DefaultStorage_T<CSphSource*> > const&, int, int, CSphIndexProgress&) in indexer
container_name  |  5# DoIndex(CSphConfigSection const&, char const*, CSphOrderedHash<CSphConfigSection, CSphString, CSphStrHashFunc, 256> const&, _IO_FILE*) in indexer
container_name  |  6# main in indexer
container_name  |  7# 0x00007F1A42354D90 in /lib/x86_64-linux-gnu/libc.so.6
container_name  |  8# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
container_name  |  9# _start in indexer
container_name  |
container_name  | -------------- backtrace ends here ---------------
container_name  | Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues)
container_name  | and attach there:
container_name  | a) searchd log, b) searchd binary, c) searchd symbols.
container_name  | Look into the chapter 'Reporting bugs' in the manual
container_name  | (https://manual.manticoresearch.com/Reporting_bugs)
container_name  | Will run gdb on '/usr/bin/indexer', pid '12'
container_name exited with code 0

xml-files-for-index.zip

manticore.conf

source products
{
    type = xmlpipe2
    xmlpipe_command = cat /var/lib/manticore/data/xml/products.xml
}
index products
{
    source = products
    path = /var/lib/manticore/data/index/product/
    #morphology = stem_en, stem_ru, soundex, metaphone
    morphology = lemmatize_en, lemmatize_ru, soundex, metaphone
    min_word_len = 1
    stopwords = ru
    wordforms = /var/lib/manticore/data/words/wordforms.txt
    html_strip = 1
    min_prefix_len = 4
    min_infix_len = 4
    expand_keywords = 1
    index_exact_words = 1
    charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+0435, U+451->U+0435
    blend_chars = U+2D
    bigram_index = all
    dict = keywords
}
source categories
{
    type = xmlpipe2
    xmlpipe_command = cat /var/lib/manticore/data/xml/categories.xml
}
index categories
{
    source = categories
    path = /var/lib/manticore/data/index/category/
    #morphology = stem_en, stem_ru, soundex, metaphone
    morphology = lemmatize_en, lemmatize_ru, soundex, metaphone
    min_word_len = 1
    stopwords = ru
    wordforms = /var/lib/manticore/data/words/wordforms.txt
    html_strip = 1
    min_prefix_len = 2
    min_infix_len = 2
    expand_keywords = 1
    index_exact_words = 1
    charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+0435, U+451->U+0435
    blend_chars = U+2D
}
indexer
{
    mem_limit = 512M
}
searchd
{
    buddy_path = #
    auto_schema = 0
    secondary_indexes = 0
    listen = 9312
    listen = 9306:mysql
    log = /var/lib/manticore/data/log/searchd.log
    query_log_mode = 666
    query_log = /var/lib/manticore/data/log/query.log
    network_timeout = 10
    pid_file = /var/lib/manticore/data/log/searchd.pid
    binlog_path = /var/lib/manticore/data/binlog
    binlog_flush = 2
    binlog_max_log_size = 512M
    seamless_rotate = 1
    preopen_tables = 1
    unlink_old = 1
    query_log_format = sphinxql
    pseudo_sharding = 0
    subtree_docs_cache = 128M
    subtree_hits_cache = 128M
}
common
{
    lemmatizer_base = /var/lib/manticore/data/dicts
}

docker compose file

version: '2.2'

services:
  manticore:
    container_name: container_name
    restart: unless-stopped
    image: manticoresearch/manticore:6.3.2
    ports:
      - "127.0.0.1:9315:9306" # bind to local interface only!
    ulimits:
      nproc: 65535
      nofile:
        soft: 65535
        hard: 65535
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./data:/var/lib/manticore/data:rw
      - ./manticore.conf:/etc/manticoresearch/manticore.conf
    command: >
      sh -c "chown -R 999:999 /var/lib/manticore/data && gosu manticore indexer --all --config /etc/manticoresearch/manticore.conf && gosu manticore searchd --config /etc/manticoresearch/manticore.conf --nodetach && gosu manticore indexer --rotate --all --config /etc/manticoresearch/manticore.conf"

There are no binlogs, or any logs at all for that matter; only these index files exist (screenshot).

Manticore Search Version:

6.3.2

Operating System Version:

Debian GNU/Linux 11 (bullseye)

Have you tried the latest development version?

No

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

- [x] Implementation completed
- [x] Tests developed
- [x] Documentation updated
- [x] Documentation reviewed
- [x] Changelog updated

KarPaLex98 commented 3 months ago

Figured it out: if the same attribute is declared with different types (for example, string karkas and then bigint karkas), indexer crashes. I removed the duplicate declaration and the crashes stopped.
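
The failure mode described above (one attribute name declared more than once in the xmlpipe2 schema) can be caught before indexer runs. Below is a minimal Python sketch, assuming the schema is embedded in the XML stream as in this thread. `duplicate_schema_names` and `SPHINX_NS` are hypothetical names used for illustration, not part of Manticore; names are compared by exact match, and attributes and fields are counted separately.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace URI taken from the docset posted in this thread.
SPHINX_NS = "http://sphinxsearch.com/"

# Map fully qualified tags to a short kind label.
SCHEMA_TAGS = {
    f"{{{SPHINX_NS}}}attr": "attr",
    f"{{{SPHINX_NS}}}field": "field",
}


def duplicate_schema_names(xml_text: str) -> list[tuple[str, str]]:
    """Return sorted (kind, name) pairs declared more than once in <sphinx:schema>."""
    # Parse from bytes so the utf-8 encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    schema = root.find(f"{{{SPHINX_NS}}}schema")
    if schema is None:
        return []
    counts = Counter()
    for child in schema:
        kind = SCHEMA_TAGS.get(child.tag)
        name = child.get("name")
        if kind is not None and name is not None:
            counts[(kind, name)] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Schema with the same attribute declared under two different types.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset xmlns:sphinx="http://sphinxsearch.com/">
<sphinx:schema>
    <sphinx:attr name="a" type="int" />
    <sphinx:attr name="a" type="string" />
</sphinx:schema>
</sphinx:docset>
"""

print(duplicate_schema_names(SAMPLE))  # [('attr', 'a')]
```

The sketch loads the whole document into memory, so for a large feed it would make more sense to run it on the schema header only or to switch to a streaming parser.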

sanikolaev commented 3 months ago

Thanks @KarPaLex98

MRE

cat << 'EOF' > xml
<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset xmlns:sphinx="http://sphinxsearch.com/">
<sphinx:schema>
    <sphinx:attr name="a" type="int" />
    <sphinx:attr name="a" type="string" />
</sphinx:schema>
<sphinx:document id="10">
    <a>1</a>
    <a>a</a>
</sphinx:document>
</sphinx:docset>
EOF

cat << 'EOF' > xml_crash.conf
source min {
  type = xmlpipe2
  xmlpipe_command = cat xml
}

index idx {
  path = idx
  source = min
}
EOF

snikolaev@dev2:~$ indexer -c xml_crash.conf --all
Manticore 6.3.3 f3dab0eba@24072313 dev (columnar 2.3.1 42f2b06@24070110) (secondary 2.3.1 42f2b06@24070110) (knn 2.3.1 42f2b06@24070110)

*** Oops, indexer crashed! Please send the following report to developers.
Manticore 6.3.3 f3dab0eba@24072313 dev (columnar 2.3.1 42f2b06@24070110) (secondary 2.3.1 42f2b06@24070110) (knn 2.3.1 42f2b06@24070110)
-------------- report begins here ---------------
Current document: docid=0, hits=0
Current batch: minid=0, maxid=0
Hit pool start: docid=0, hit=0
-------------- backtrace begins here ---------------
Program compiled with Clang 16.0.6
Configured with flags: Configured with these definitions: -DDISTR_BUILD=jammy -DUSE_SYSLOG=1 -DWITH_GALERA=1 -DWITH_RE2=1 -DWITH_RE2_FORCE_STATIC=1 -DWITH_STEMMER=1 -DWITH_STEMMER_FORCE_STATIC=1 -DWITH_NLJSON=1 -DWITH_UNIALGO=1 -DWITH_ICU=1 -DWITH_ICU_FORCE_STATIC=1 -DWITH_SSL=1 -DWITH_ZLIB=1 -DDL_ZLIB=1 -DZLIB_LIB=libz.so.1 -DWITH_ZSTD=1 -DDL_ZSTD=1 -DZSTD_LIB=libzstd.so.1 -DWITH_CURL=1 -DDL_CURL=1 -DCURL_LIB=libcurl.so.4 -DWITH_ODBC=1 -DDL_ODBC=1 -DODBC_LIB=libodbc.so.2 -DWITH_EXPAT=1 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DWITH_ICONV=1 -DWITH_MYSQL=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.21 -DWITH_POSTGRESQL=1 -DDL_POSTGRESQL=1 -DPOSTGRESQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/lib/manticore -DFULL_SHARE_DIR=/usr/share/manticore
Built on Linux x86_64 (jammy) (cross-compiled)
Stack bottom = 0x0, thread stack size = 0x20000
Trying system backtrace:
begin of system symbols:
indexer(_Z12sphBacktraceib+0x227)[0x558b1110e807]
indexer(_Z7sigsegvi+0xbb)[0x558b11001eeb]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fc8da48f520]
indexer(+0xcd3c61)[0x558b11023c61]
indexer(_ZN13CSphIndex_VLN5BuildERKN3sph8Vector_TIP10CSphSourceNS0_13DefaultCopy_TIS3_EENS0_14DefaultRelimitENS0_16DefaultStorage_TIS3_EEEEiiR17CSphIndexProgress+0x100d)[0x558b11020d0d]
indexer(_Z7DoIndexRK17CSphConfigSectionPKcRK15CSphOrderedHashIS_10CSphString15CSphStrHashFuncLi256EEP8_IO_FILE+0x1d60)[0x558b10fffce0]
indexer(main+0x2a1a)[0x558b11004ffa]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fc8da476d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fc8da476e40]
indexer(_start+0x25)[0x558b10ff6d75]
Trying boost backtrace:
 0# sphBacktrace(int, bool) in indexer
 1# sigsegv(int) in indexer
 2# 0x00007FC8DA48F520 in /lib/x86_64-linux-gnu/libc.so.6
 3# 0x0000558B11023C61 in indexer
 4# CSphIndex_VLN::Build(sph::Vector_T<CSphSource*, sph::DefaultCopy_T<CSphSource*>, sph::DefaultRelimit, sph::DefaultStorage_T<CSphSource*> > const&, int, int, CSphIndexProgress&) in indexer
 5# DoIndex(CSphConfigSection const&, char const*, CSphOrderedHash<CSphConfigSection, CSphString, CSphStrHashFunc, 256> const&, _IO_FILE*) in indexer
 6# main in indexer
 7# 0x00007FC8DA476D90 in /lib/x86_64-linux-gnu/libc.so.6
 8# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
 9# _start in indexer

-------------- backtrace ends here ---------------

tomatolog commented 2 months ago

The indexer crash was fixed in https://github.com/manticoresoftware/manticoresearch/commit/eb1d30a3f8ce956d3b3efcbc58b3cedcd2efc95a. Indexer now exits with an error message when duplicate attributes or fields are declared.

To get the fix, you need to use indexer from the dev package repository.

tomatolog commented 2 months ago

@PavelShilin89 could you create a CLT test with the two cases:

cat << 'EOF' > xml_crash.conf
source min {
  type = xmlpipe2
  xmlpipe_command = cat xml
}

index idx {
  path = idx
  source = min
}
EOF

For every config posted, indexer should exit with an error message and must not crash.
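
The pass condition requested here (an error message, not a crash) can be expressed as a small check. This is a Python sketch only, not a CLT test. The crash banner string and the `indexer -c <config> --all` invocation are taken from this thread; `classify_run` and `run_indexer` are hypothetical helper names. It assumes that a clean rejection is reported through a non-zero exit code, which the thread does not confirm. The first log in this thread ends with the container exiting with code 0 after the crash, so the exit code alone may not be a reliable signal and the banner is checked first.

```python
import subprocess

# Banner printed by the crash handler, as seen in the reports in this thread.
CRASH_BANNER = "Oops, indexer crashed!"


def classify_run(returncode: int, output: str) -> str:
    """Classify an indexer run as 'crash', 'error', or 'ok'."""
    # On POSIX, subprocess reports death by signal as a negative return code.
    if CRASH_BANNER in output or returncode < 0:
        return "crash"
    # Assumption: a rejected config is reported via a non-zero exit code.
    if returncode != 0:
        return "error"
    return "ok"


def run_indexer(config_path: str, indexer: str = "indexer") -> str:
    """Run `indexer -c <config> --all` and classify the outcome."""
    proc = subprocess.run(
        [indexer, "-c", config_path, "--all"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the banner is never missed
        text=True,
    )
    return classify_run(proc.returncode, proc.stdout)
```

With a fixed build, `run_indexer("xml_crash.conf")` would be expected to return `"error"` for a schema with duplicate declarations, and `"crash"` on the affected versions.
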
PavelShilin89 commented 2 months ago

Done in https://github.com/manticoresoftware/manticoresearch/pull/2536