manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0

Indexer Crashes #1480

Closed kalafaye closed 1 year ago

kalafaye commented 1 year ago

Describe the bug: I just tried to run indexer and got the error message at the bottom.

To Reproduce Steps to reproduce the behavior:

  1. run indexer

Expected behavior A clear and concise description of what you expected to happen.

Describe the environment:

Messages from log files: Messages from searchd.log and query.log (if applicable).

Additional context

** Oops, indexer crashed! Please send the following report to developers.
Manticore 6.2.12 dc5144d35@230822 (columnar 2.2.4 5aec342@230822) (secondary 2.2.4 5aec342@230822)
-------------- report begins here ---------------
Current document: docid=0, hits=0
Current batch: minid=0, maxid=0
Hit pool start: docid=0, hit=0
-------------- backtrace begins here ---------------
Program compiled with Clang 15.0.7
Configured with flags:
Configured with these definitions: -DDISTR_BUILD=bionic -DUSE_SYSLOG=1 -DWITH_GALERA=1 -DWITH_RE2=1 -DWITH_RE2_FORCE_STATIC=1 -DWITH_STEMMER=1 -DWITH_STEMMER_FORCE_STATIC=1 -DWITH_NLJSON=1 -DWITH_UNIALGO=1 -DWITH_ICU=1 -DWITH_ICU_FORCE_STATIC=1 -DWITH_SSL=1 -DWITH_ZLIB=1 -DWITH_ZSTD=1 -DDL_ZSTD=1 -DZSTD_LIB=libzstd.so.1 -DWITH_CURL=1 -DDL_CURL=1 -DCURL_LIB=libcurl.so.4 -DWITH_ODBC=1 -DDL_ODBC=1 -DODBC_LIB=libodbc.so.2 -DWITH_EXPAT=1 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DWITH_ICONV=1 -DWITH_MYSQL=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmysqlclient.so.20 -DWITH_POSTGRESQL=1 -DDL_POSTGRESQL=1 -DPOSTGRESQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/lib/manticore -DFULL_SHARE_DIR=/usr/share/manticore
Built on Linux x86_64 (bionic) (cross-compiled)
Stack bottom = 0x0, thread stack size = 0x20000
Trying system backtrace:
begin of system symbols:
indexer(_Z12sphBacktraceib+0x22a)[0x55ad9abd5d0a]
indexer(_Z7sigsegvi+0xbb)[0x55ad9aad11bb]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980)[0x7fcabb896980]
/lib/x86_64-linux-gnu/libc.so.6(+0xb1306)[0x7fcabb544306]
indexer(_ZN3sph10vSprintf_TI15StringBuilder_cEEvPT_PKcP13__va_list_tag+0x914)[0x55ad9abdc434]
indexer(_ZN15StringBuilder_c8vSprintfEPKcP13__va_list_tag+0xee)[0x55ad9aada79e]
indexer(_ZN6TlsMsg3ErrEPKcz+0xb0)[0x55ad9abd0890]
indexer(_ZN16CSphConfigParser5ParseEv+0x1046)[0x55ad9abd2e76]
indexer(_Z11ParseConfigP15CSphOrderedHashIS_I17CSphConfigSection10CSphString15CSphStrHashFuncLi256EES1_S2_Li256EES1_RK11VecTraits_TIcE+0x6a)[0x55ad9abd34ea]
indexer(+0xda6866)[0x55ad9abd3866]
indexer(_Z13sphLoadConfigRK10CSphStringbRS_+0x9)[0x55ad9abd3749]
indexer(main+0xed7)[0x55ad9aad2787]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7fcabb4b4c87]
indexer(_start+0x2a)[0x55ad9aac5b0a]
Trying boost backtrace:
0# sphBacktrace(int, bool) in indexer
1# sigsegv(int) in indexer
2# 0x00007FCABB896980 in /lib/x86_64-linux-gnu/libpthread.so.0
3# 0x00007FCABB544306 in /lib/x86_64-linux-gnu/libc.so.6
4# void sph::vSprintf_T(StringBuilder_c, char const, __va_list_tag) in indexer
5# StringBuilder_c::vSprintf(char const, __va_list_tag) in indexer
6# TlsMsg::Err(char const, ...) in indexer
7# CSphConfigParser::Parse() in indexer
8# ParseConfig(CSphOrderedHash<CSphOrderedHash<CSphConfigSection, CSphString, CSphStrHashFunc, 256>, CSphString, CSphStrHashFunc, 256>, CSphString, VecTraits_T const&) in indexer
9# 0x000055AD9ABD3866 in indexer
10# sphLoadConfig(CSphString const&, bool, CSphString&) in indexer
11# main in indexer
12# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
13# _start in indexer

-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues) and attach there: a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the manual (https://manual.manticoresearch.com/Reporting_bugs)
Will run gdb on '/usr/bin/indexer', pid '31523'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fcabb8962f4 in __waitpid (pid=31568, stat_loc=0x7ffcea0d7c54, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30
30 ../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory.
Id Target Id Frame

Thread 1 (Thread 0x7fcabcc1efc0 (LWP 31523)):
#0 0x00007fcabb8962f4 in __waitpid (pid=31568, stat_loc=0x7ffcea0d7c54, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1 0x000055ad9abd55de in sphDumpGdb(int, char const, char const) ()
#2 0x000055ad9abd5ea1 in sphBacktrace(int, bool) ()
#3 0x000055ad9aad11bb in sigsegv(int) ()
#4 <signal handler called>
#5 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
#6 0x000055ad9abdc434 in void sph::vSprintf_T(StringBuilder_c, char const, __va_list_tag*) ()
#7 0x000055ad9aada79e in StringBuilder_c::vSprintf(char const, __va_list_tag) ()
#8 0x000055ad9abd0890 in TlsMsg::Err(char const*, ...) ()
#9 0x000055ad9abd2e76 in CSphConfigParser::Parse() ()
#10 0x000055ad9abd34ea in ParseConfig(CSphOrderedHash<CSphOrderedHash<CSphConfigSection, CSphString, CSphStrHashFunc, 256>, CSphString, CSphStrHashFunc, 256>*, CSphString, VecTraits_T const&) ()
#11 0x000055ad9abd3866 in ?? ()
#12 0x000055ad9abd3749 in sphLoadConfig(CSphString const&, bool, CSphString&) ()
#13 0x000055ad9aad2787 in main ()

Main thread:
#0 0x00007fcabb8962f4 in __waitpid (pid=31568, stat_loc=0x7ffcea0d7c54, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1 0x000055ad9abd55de in sphDumpGdb(int, char const, char const) ()
#2 0x000055ad9abd5ea1 in sphBacktrace(int, bool) ()
#3 0x000055ad9aad11bb in sigsegv(int) ()
#4 <signal handler called>
#5 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
#6 0x000055ad9abdc434 in void sph::vSprintf_T(StringBuilder_c, char const, __va_list_tag*) ()
#7 0x000055ad9aada79e in StringBuilder_c::vSprintf(char const, __va_list_tag) ()
#8 0x000055ad9abd0890 in TlsMsg::Err(char const*, ...) ()
#9 0x000055ad9abd2e76 in CSphConfigParser::Parse() ()
#10 0x000055ad9abd34ea in ParseConfig(CSphOrderedHash<CSphOrderedHash<CSphConfigSection, CSphString, CSphStrHashFunc, 256>, CSphString, CSphStrHashFunc, 256>*, CSphString, VecTraits_T const&) ()
#11 0x000055ad9abd3866 in ?? ()
#12 0x000055ad9abd3749 in sphLoadConfig(CSphString const&, bool, CSphString&) ()
#13 0x000055ad9aad2787 in main ()

Local variables: resultvar = 18446744073709551104 sc_ret =
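
The entries in the system backtrace above follow the `module(symbol+offset)[address]` shape. Here is a minimal sketch, assuming that shape holds for every entry, of splitting a pasted report back into one frame per record. The helper name `split_frames` is hypothetical and is not part of Manticore or its tooling.

```python
import re

# One backtrace entry: module(symbol+0xOFFSET)[0xADDRESS].
# The symbol may be empty, as in libc.so.6(+0xb1306)[0x7fcabb544306].
FRAME_RE = re.compile(
    r"(?P<module>\S+?)"
    r"\((?P<symbol>[^()+]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def split_frames(report):
    """Return one dict per frame found in a run-together backtrace string."""
    return [m.groupdict() for m in FRAME_RE.finditer(report)]


if __name__ == "__main__":
    sample = (
        "begin of system symbols: "
        "indexer(_ZN6TlsMsg3ErrEPKcz+0xb0)[0x55ad9abd0890] "
        "indexer(_ZN16CSphConfigParser5ParseEv+0x1046)[0x55ad9abd2e76]"
    )
    for i, frame in enumerate(split_frames(sample)):
        # Print a compact, one-frame-per-line view.
        print(i, frame["module"], frame["symbol"] or "??", frame["address"])
```

The mangled symbols can then be passed to a demangler such as `c++filt` if one is available on the machine.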

sanikolaev commented 1 year ago

Hi. Is there a way to reproduce it?

tomatolog commented 1 year ago

The crash happens while loading the config.

Could you provide your config so we can reproduce this crash locally?

kalafaye commented 1 year ago

Here is the config:

#############################################################################
## data source definition
#############################################################################

source main_src_foo {

        type                    = mysql
        sql_host                = devMySQL04
        sql_user                = dbadmin
        sql_pass                = @@@@@@@
        sql_db                  = foo
        sql_port                = 3306  # optional, default is 3306
        sql_sock                = /db/mysql/data/mysql.sock

}

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_doo_business_search {

            type                      =     tsvpipe
            tsvpipe_command           =     zcat /db/manticore/doo/doo_business_search.gz          
            tsvpipe_attr_bigint       =     poseidonid 
            tsvpipe_field_string       =     type       
            tsvpipe_field_string      =     states
            tsvpipe_field_string      =     cities
            tsvpipe_field_string      =     business_names

}

#############################################################################
## index definition
#############################################################################

table idx_doo_business_search {
            source              = src_doo_business_search
            path                = /db/manticore/doo/idx_doo_business_search
            dict                = keywords
            morphology          = none
            min_word_len        = 1
            min_prefix_len      = 3
            html_strip          = 0
            stopwords           = /etc/manticoresearch/stop_words.txt
            wordforms           = /etc/manticoresearch/wordforms.txt
            #morphology         = lemmatize_en_all
            #index_exact_words  = 1
            #min_stemming_len   = 4
  }

source src_moo_business_search {
    type                 =  tsvpipe
    tsvpipe_command      =  zcat /db/manticore/moo/moo.gz

    tsvpipe_attr_bigint  =  poseidonid
    tsvpipe_attr_bigint  =  hashsource
    tsvpipe_field_string =  businessname
    tsvpipe_attr_bigint  =  hash_city
    tsvpipe_attr_bigint  =  stateid
    tsvpipe_attr_bigint  =  hash_housenumber
    tsvpipe_attr_bigint  =  hash_streetname
    tsvpipe_attr_bigint  =  hash_unitnumber
    tsvpipe_attr_bigint  =  hash_zipcode
    tsvpipe_attr_bigint  =  hash_county

}

#############################################################################
## index definition
#############################################################################

table idx_moo_business_search {
    type            = plain
    dict            = keywords
    morphology      = none
    stopwords       = /etc/manticoresearch/stop_words.txt #removed en
    source          = src_moo_business_search
    path            = /db/manticore/moo/idx_moo_business_search
    wordforms       = /etc/manticoresearch/wordforms.txt
    min_prefix_len  = 3
}

#############################################################################
## SOURCE DEFINITION - poo
#############################################################################

source src_poo_search_full {

            type                      =     tsvpipe
            tsvpipe_command           =     zcat /db/manticore/poo/poo_business.gz          

            tsvpipe_attr_bigint       =     poseidenid  
            tsvpipe_field_string      =     business_name
            tsvpipe_field_string        =   house_number                 
            tsvpipe_field_string      =     street_name
            tsvpipe_field_string      =     unit_number
            tsvpipe_field_string      =     zip_code
            tsvpipe_field_string      =     city  
            tsvpipe_field_string      =     state  

}

source src_poo_search_full_test:src_poo_search_full {
      tsvpipe_command   =     zcat /db/manticore/poo_test/poo_search_full.gz
}

#############################################################################
## index definition
#############################################################################

table idx_poo_search_full {

            source          = src_poo_search_full
            path            = /db/manticore/poo/idx_poo_search_full
            dict            = keywords
            morphology      = none
            min_word_len    = 1
            min_prefix_len  = 3
            html_strip      = 0
            stopwords     = /etc/manticoresearch/stop_words.txt
        blend_chars     = &, @, U+23, U+2D, U+2E, U+AD
blend_mode      = trim_head, trim_tail
ignore_chars    = U+27
charset_table   = 0..9, english

wordforms = /etc/manticoresearch/wordforms.txt

  }

index idx_poo_search_full_test:idx_poo_search_full {
      source      =     src_poo_search_full_test
      path            = /db/manticore/poo_test/poo_search_full
}

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_search_full {

            type                      =     tsvpipe
            tsvpipe_command           =     zcat /db/manticore/foo/foo_search_full.gz          

            tsvpipe_attr_bigint       =     foo_id     
            tsvpipe_field_string      =     first_name
            tsvpipe_field_string      =     middle_name
            tsvpipe_field_string      =     last_name
            tsvpipe_field_string      =     current_job_title  
            tsvpipe_field_string      =     past_job_title
            tsvpipe_field_string      =     current_level
            tsvpipe_field_string      =     past_level
            tsvpipe_field_string      =     current_department
            tsvpipe_field_string      =     past_department
            tsvpipe_field_string      =     current_business_name
            tsvpipe_field_string      =     past_business_name
            tsvpipe_field_string      =     current_industry
            tsvpipe_field_string      =     past_industry
            tsvpipe_field_string      =     skill
            tsvpipe_field_string      =     current_street_address
            tsvpipe_field_string      =     past_street_address
            tsvpipe_field_string      =     current_city
            tsvpipe_field_string      =     past_city
            tsvpipe_field_string      =     current_state
            tsvpipe_field_string      =     past_state
            tsvpipe_field_string      =     current_zip
            tsvpipe_field_string      =     past_zip
            tsvpipe_field_string      =     p_city
            tsvpipe_field_string      =     p_state
            tsvpipe_attr_uint         =     levelrank
            tsvpipe_attr_float        =     contactscore 
            tsvpipe_attr_bool         =     has_contact 
            tsvpipe_field_string      =     phone        
            tsvpipe_attr_bool         =     has_phone
            tsvpipe_field_string      =     email        
            tsvpipe_attr_bool         =     has_email
            tsvpipe_field_string      =     linkedin        
            tsvpipe_attr_bool         =     has_linkedin

}

#source src_foo_search_full_test:src_foo_search_full
#{
#      tsvpipe_command   =     zcat /db/manticore/foo_dedupe/foo_dedupe.gz
#}

#############################################################################
## index definition
#############################################################################

index idx_foo_search_full {

            source              = src_foo_search_full
            path                = /db/manticore/foo/idx_foo_search_full
            dict                = keywords
            morphology          = none
            min_word_len        = 1
            min_prefix_len      = 3
            html_strip          = 0
            stopwords          = /etc/manticoresearch/stop_words.txt             
            wordforms           = /etc/manticoresearch/wordforms.txt
            #morphology         = lemmatize_en_all
            #index_exact_words  = 1
            #min_stemming_len   = 4

  }

#index idx_foo_search_full_test:idx_foo_search_full
#{
#      source      =     src_foo_search_full_test
#      path            = /db/manticore/foo_dedupe/idx_foo_search_full
#}

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_seo {

            type                      =     tsvpipe
            tsvpipe_command           =     zcat /db/manticore/foo/foo_seo.gz          

            tsvpipe_field_string      =     lnameinitial
            tsvpipe_field_string      =     lastname
            tsvpipe_attr_string       =     firstname
            tsvpipe_field_string      =     location
            tsvpipe_field_string      =     state
            tsvpipe_attr_uint         =     popularity
            tsvpipe_field_string      =     searchtype

}

index idx_foo_seo {

            source              = src_foo_seo
            path                = /db/manticore/foo/idx_foo_seo
            dict                = keywords
            morphology          = none
    min_prefix_len      = 1
            html_strip          = 0

  }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_seo_city {

            type                      =     tsvpipe
            tsvpipe_command           =     zcat /db/manticore/foo/foo_seo_city.gz          

            tsvpipe_field_string      =     lnameinitial
            tsvpipe_field_string      =     lastname
            tsvpipe_attr_string       =     firstname
            tsvpipe_attr_string       =     company
            tsvpipe_attr_string       =     title
            tsvpipe_field_string      =     location
            tsvpipe_field_string      =     state
    tsvpipe_attr_bigint       =     poseidonid
            tsvpipe_field_string      =     searchtype

}

index idx_foo_seo_city {

            source              = src_foo_seo_city
            path                = /db/manticore/foo/idx_foo_seo_city
            dict                = keywords
            morphology          = none
    min_prefix_len      = 1
            html_strip          = 0

  }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_seo_first {

            type                      =     tsvpipe
            tsvpipe_command           =     zcat /db/manticore/foo/foo_seo_first.gz          

            tsvpipe_field_string      =     lnameinitial
            tsvpipe_field_string      =     lastname
            tsvpipe_attr_string       =     firstname
            tsvpipe_attr_string       =     company
            tsvpipe_attr_string       =     title
            tsvpipe_field_string      =     location
            tsvpipe_field_string      =     state
            tsvpipe_attr_uint         =     popularity
    tsvpipe_attr_bigint       =     poseidonid
            tsvpipe_field_string      =     searchtype

}

table idx_foo_seo_first {

            source              = src_foo_seo_first
            path                = /db/manticore/foo/idx_foo_seo_first
            dict                = keywords
            morphology          = none
    min_prefix_len      = 1
            html_strip          = 0

  }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_seo_state {

            type                      =     tsvpipe
            tsvpipe_command           =     zcat /db/manticore/foo/foo_seo_state.gz          

            tsvpipe_field_string      =     lnameinitial
            tsvpipe_field_string      =     lastname
            tsvpipe_attr_string       =     firstname
            tsvpipe_attr_string       =     company
            tsvpipe_attr_string       =     title
            tsvpipe_field_string      =     location
            tsvpipe_field_string      =     state
            tsvpipe_attr_uint         =     popularity
    tsvpipe_attr_bigint       =     poseidonid
            tsvpipe_field_string      =     searchtype

}

table idx_foo_seo_state {

            source              = src_foo_seo_state
            path                = /db/manticore/foo/idx_foo_seo_state
            dict                = keywords
            morphology          = none
    min_prefix_len      = 1
            html_strip          = 0

  }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_test_seo {

            type                      =     tsvpipe
            tsvpipe_command           =     zcat /db/manticore/foo/foo_test_seo.gz          

    tsvpipe_field_string      =     hashval
            tsvpipe_attr_string       =     lnameinitial
            tsvpipe_attr_string       =     lastname
            tsvpipe_attr_string       =     firstname
            tsvpipe_attr_string       =     company
            tsvpipe_attr_string       =     title
            tsvpipe_attr_string       =     location
            tsvpipe_attr_string       =     state
            tsvpipe_attr_uint         =     popularity
            tsvpipe_attr_string       =     searchtype

}

table idx_foo_test_seo {

            source              = src_foo_test_seo
            path                = /db/manticore/foo/idx_foo_test_seo
            dict                = keywords
            morphology          = none
    min_prefix_len      = 1
            html_strip          = 0

  }

#############################################################################
## indexer settings
#############################################################################

indexer {
    mem_limit    = 2047M
    write_buffer = 1G
}

#############################################################################
## searchd settings
#############################################################################

searchd {
listen = 9312:sphinx
listen = 9306:mysql41
listen = 9307:mysql_vip
listen = 9308:http

listen                                                  = 127.0.0.1:9312:sphinx
listen                                                  = 127.0.0.1:9306:mysql41
listen                                                  = 127.0.0.1:9307:mysql_vip
listen                                                  = 127.0.0.1:9308:http

log                         = /var/log/manticore/searchd.log
query_log                       = /var/log/manticore/query.log
    query_log_mode                                  = 644
client_timeout                  = 500
pid_file                    = /var/run/manticore/searchd.pid
unlink_old                  = 1
max_packet_size                 = 50M
max_filters                 = 256
max_filter_values               = 4096
max_batch_queries               = 32
agent_query_timeout                             = 30000 

access_plain_attrs = mmap_preread
binlog_path =
docstore_cache_size = 512M
net_workers = 2
qcache_max_bytes = 67108864
read_buffer_docs = 512K
read_buffer_hits = 512K

threads = 40

    pseudo_sharding                                 = 1
    query_log_format                                = sphinxql

mysql_version_string = 5.7.1
not_terms_only_allowed = 1
shutdown_timeout = 5
agent_retry_count = 2
}

index idx_foo_search_full_20230908:idx_foo_search_full { path = /db/manticore/foo/20230908/idx_foo_search_full }

index foo_seo_city_20230908:foo_seo_city { path = /db/manticore/foo/20230908/foo_seo_city }

index foo_seo_first_20230908:foo_seo_first { path = /db/manticore/foo/20230908/foo_seo_first }

index foo_seo_20230908:foo_seo { path = /db/manticore/foo/20230908/foo_seo }

index foo_seo_opts_20230908:foo_seo_opts { path = /db/manticore/foo/20230908/foo_seo_opts }

index foo_seo_state_20230908:foo_seo_state { path = /db/manticore/foo/20230908/foo_seo_state }
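
The config above declares several sections in the `name:parent` inheritance form. Here is a minimal sketch, assuming the plain-config header syntax `source|index|table name[:parent]`, that lists derived sections whose parent is not declared anywhere in the same text. The function name `find_missing_parents` is hypothetical, and this is not Manticore's own parser. Checking against all declarations regardless of order is a simplification, and whether an undeclared parent is related to the crash is not established at this point in the thread.

```python
import re

# Section header: "<kind> <name>" optionally followed by ":<parent>",
# then either an opening brace on the same line or the end of the line.
HEADER_RE = re.compile(
    r"^\s*(?P<kind>source|index|table)\s+(?P<name>\w+)"
    r"(?:\s*:\s*(?P<parent>\w+))?\s*(?:\{|$)"
)


def find_missing_parents(config_text):
    """Return (kind, name, parent) for sections whose parent is undeclared."""
    declared = set()  # (kind, name) for every section header seen
    derived = []      # (kind, name, parent) for every inheriting section
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0]  # drop comments
        m = HEADER_RE.match(line)
        if not m:
            continue
        # Assumption: "table" and "index" name the same kind of section.
        kind = "index" if m["kind"] == "table" else m["kind"]
        declared.add((kind, m["name"]))
        if m["parent"]:
            derived.append((kind, m["name"], m["parent"]))
    return [(k, n, p) for k, n, p in derived if (k, p) not in declared]


if __name__ == "__main__":
    import sys
    # Usage: python check_parents.py /path/to/manticore.conf
    with open(sys.argv[1], encoding="utf-8") as fh:
        for kind, name, parent in find_missing_parents(fh.read()):
            print(f"{kind} {name}: parent '{parent}' is not declared")
```

Run against the pasted text, a check like this would be expected to flag entries such as `foo_seo_city_20230908`, whose parent `foo_seo_city` does not appear under that exact name (the declared section is `idx_foo_seo_city`).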

kalafaye commented 1 year ago

So basically I didn't do anything new. I went to rotate the index like normal (it SEEMS to be because of the dynamic config, but that has been working too, so maybe a red herring). It gave the segmentation fault (core dumped) message again when I retried, so I was going to cycle through a stop and then a start (service manticore stop/start), and then I got the long error message. Now it won't start.

tomatolog commented 1 year ago

Could you attach your config as a file, to make sure the GitHub parser does not break the content while trying to format it?
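
When attaching a file is not convenient, wrapping the pasted config in a Markdown code fence is another way to keep lines that begin with `#` from being interpreted as headings. Here is a minimal sketch, assuming CommonMark-style backtick fences. The helper name `fence_for_markdown` is hypothetical.

```python
import re


def fence_for_markdown(text, info=""):
    """Wrap text in a backtick fence longer than any backtick run inside it."""
    # A backtick fence is only closed by a run at least as long as the opener,
    # so use one more backtick than the longest run found in the text.
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    fence = "`" * max(3, longest + 1)
    body = text.rstrip("\n")
    return f"{fence}{info}\n{body}\n{fence}\n"


if __name__ == "__main__":
    config = "## data source definition\nsource main_src_foo\n{\n    type = mysql\n}\n"
    # The result can be pasted into a comment box as-is.
    print(fence_for_markdown(config))
```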

sanikolaev commented 1 year ago

The escaped config is:

#############################################################################
## data source definition
#############################################################################

source main_src_foo
    {

            type                    = mysql
            sql_host                = devMySQL04
            sql_user                = dbadmin
            sql_pass                = @@@@@@@
            sql_db                  = foo
            sql_port                = 3306  # optional, default is 3306
            sql_sock                = /db/mysql/data/mysql.sock

    }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_doo_business_search
{

                type                      =     tsvpipe
                tsvpipe_command           =     zcat /db/manticore/doo/doo_business_search.gz          
                tsvpipe_attr_bigint       =     poseidonid 
                tsvpipe_field_string       =     type       
                tsvpipe_field_string      =     states
                tsvpipe_field_string      =     cities
                tsvpipe_field_string      =     business_names
}

#############################################################################
## index definition
#############################################################################

table idx_doo_business_search
      {
                source              = src_doo_business_search
                path                = /db/manticore/doo/idx_doo_business_search
                dict                = keywords
                morphology          = none
                min_word_len        = 1
                min_prefix_len      = 3
                html_strip          = 0
                stopwords          = /etc/manticoresearch/stop_words.txt             
                wordforms           = /etc/manticoresearch/wordforms.txt
                #morphology         = lemmatize_en_all
                #index_exact_words  = 1
                #min_stemming_len   = 4
      }
source src_moo_business_search
{
        type                 =     tsvpipe
        tsvpipe_command      =     zcat /db/manticore/moo/moo.gz

        tsvpipe_attr_bigint  =  poseidonid
        tsvpipe_attr_bigint  =  hashsource
        tsvpipe_field_string =  businessname
        tsvpipe_attr_bigint  =  hash_city
        tsvpipe_attr_bigint  =  stateid
        tsvpipe_attr_bigint  =  hash_housenumber
        tsvpipe_attr_bigint  =  hash_streetname
        tsvpipe_attr_bigint  =  hash_unitnumber
        tsvpipe_attr_bigint  =  hash_zipcode
        tsvpipe_attr_bigint  =  hash_county
}

#############################################################################
## index definition
#############################################################################

table idx_moo_business_search
{
    type            = plain
    dict            = keywords
    morphology      = none
    stopwords       = /etc/manticoresearch/stop_words.txt #removed en
    source          = src_moo_business_search
    path            = /db/manticore/moo/idx_moo_business_search
    wordforms       = /etc/manticoresearch/wordforms.txt
    min_prefix_len  = 3

}
#############################################################################
## SOURCE DEFINITION - poo
#############################################################################

source src_poo_search_full
{

                type                      =     tsvpipe
                tsvpipe_command           =     zcat /db/manticore/poo/poo_business.gz          

                tsvpipe_attr_bigint       =     poseidenid  
                tsvpipe_field_string      =     business_name
                tsvpipe_field_string        =   house_number                 
                tsvpipe_field_string      =     street_name
                tsvpipe_field_string      =     unit_number
                tsvpipe_field_string      =     zip_code
                tsvpipe_field_string      =     city  
                tsvpipe_field_string      =     state  

}

source src_poo_search_full_test:src_poo_search_full
{
      tsvpipe_command   =     zcat /db/manticore/poo_test/poo_search_full.gz
}

#############################################################################
## index definition
#############################################################################

table idx_poo_search_full
      {

                source          = src_poo_search_full
                path            = /db/manticore/poo/idx_poo_search_full
                dict            = keywords
                morphology      = none
                min_word_len    = 1
                min_prefix_len  = 3
                html_strip      = 0
                stopwords     = /etc/manticoresearch/stop_words.txt
            blend_chars     = &, @, U+23, U+2D, U+2E, U+AD
    blend_mode      = trim_head, trim_tail
    ignore_chars    = U+27
    charset_table   = 0..9, english
  wordforms       = /etc/manticoresearch/wordforms.txt

      }

index idx_poo_search_full_test:idx_poo_search_full
{
      source      =     src_poo_search_full_test
      path            = /db/manticore/poo_test/poo_search_full
}

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_search_full
{

                type                      =     tsvpipe
                tsvpipe_command           =     zcat /db/manticore/foo/foo_search_full.gz          

                tsvpipe_attr_bigint       =     foo_id     
                tsvpipe_field_string      =     first_name
                tsvpipe_field_string      =     middle_name
                tsvpipe_field_string      =     last_name
                tsvpipe_field_string      =     current_job_title  
                tsvpipe_field_string      =     past_job_title
                tsvpipe_field_string      =     current_level
                tsvpipe_field_string      =     past_level
                tsvpipe_field_string      =     current_department
                tsvpipe_field_string      =     past_department
                tsvpipe_field_string      =     current_business_name
                tsvpipe_field_string      =     past_business_name
                tsvpipe_field_string      =     current_industry
                tsvpipe_field_string      =     past_industry
                tsvpipe_field_string      =     skill
                tsvpipe_field_string      =     current_street_address
                tsvpipe_field_string      =     past_street_address
                tsvpipe_field_string      =     current_city
                tsvpipe_field_string      =     past_city
                tsvpipe_field_string      =     current_state
                tsvpipe_field_string      =     past_state
                tsvpipe_field_string      =     current_zip
                tsvpipe_field_string      =     past_zip
                tsvpipe_field_string      =     p_city
                tsvpipe_field_string      =     p_state
                tsvpipe_attr_uint         =     levelrank
                tsvpipe_attr_float        =     contactscore 
                tsvpipe_attr_bool         =     has_contact 
                tsvpipe_field_string      =     phone        
                tsvpipe_attr_bool         =     has_phone
                tsvpipe_field_string      =     email        
                tsvpipe_attr_bool         =     has_email
                tsvpipe_field_string      =     linkedin        
                tsvpipe_attr_bool         =     has_linkedin

}

#source src_foo_search_full_test:src_foo_search_full
#{
#      tsvpipe_command   =     zcat /db/manticore/foo_dedupe/foo_dedupe.gz
#}

#############################################################################
## index definition
#############################################################################

index idx_foo_search_full
      {

                source              = src_foo_search_full
                path                = /db/manticore/foo/idx_foo_search_full
                dict                = keywords
                morphology          = none
                min_word_len        = 1
                min_prefix_len      = 3
                html_strip          = 0
                stopwords          = /etc/manticoresearch/stop_words.txt             
                wordforms           = /etc/manticoresearch/wordforms.txt
                #morphology         = lemmatize_en_all
                #index_exact_words  = 1
                #min_stemming_len   = 4

      }

#index idx_foo_search_full_test:idx_foo_search_full
#{
#      source      =     src_foo_search_full_test
#      path            = /db/manticore/foo_dedupe/idx_foo_search_full
#}

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_seo
{

                type                      =     tsvpipe
                tsvpipe_command           =     zcat /db/manticore/foo/foo_seo.gz          

                tsvpipe_field_string      =     lnameinitial
                tsvpipe_field_string      =     lastname
                tsvpipe_attr_string       =     firstname
                tsvpipe_field_string      =     location
                tsvpipe_field_string      =     state
                tsvpipe_attr_uint         =     popularity
                tsvpipe_field_string      =     searchtype

}

index idx_foo_seo
      {

                source              = src_foo_seo
                path                = /db/manticore/foo/idx_foo_seo
                dict                = keywords
                morphology          = none
                min_prefix_len      = 1
                html_strip          = 0

      }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_seo_city
{

                type                      =     tsvpipe
                tsvpipe_command           =     zcat /db/manticore/foo/foo_seo_city.gz          

                tsvpipe_field_string      =     lnameinitial
                tsvpipe_field_string      =     lastname
                tsvpipe_attr_string       =     firstname
                tsvpipe_attr_string       =     company
                tsvpipe_attr_string       =     title
                tsvpipe_field_string      =     location
                tsvpipe_field_string      =     state
                tsvpipe_attr_bigint       =     poseidonid
                tsvpipe_field_string      =     searchtype

}

index idx_foo_seo_city
      {

                source              = src_foo_seo_city
                path                = /db/manticore/foo/idx_foo_seo_city
                dict                = keywords
                morphology          = none
                min_prefix_len      = 1
                html_strip          = 0

      }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_seo_first
{

                type                      =     tsvpipe
                tsvpipe_command           =     zcat /db/manticore/foo/foo_seo_first.gz          

                tsvpipe_field_string      =     lnameinitial
                tsvpipe_field_string      =     lastname
                tsvpipe_attr_string       =     firstname
                tsvpipe_attr_string       =     company
                tsvpipe_attr_string       =     title
                tsvpipe_field_string      =     location
                tsvpipe_field_string      =     state
                tsvpipe_attr_uint         =     popularity
                tsvpipe_attr_bigint       =     poseidonid
                tsvpipe_field_string      =     searchtype

}

table idx_foo_seo_first
      {

                source              = src_foo_seo_first
                path                = /db/manticore/foo/idx_foo_seo_first
                dict                = keywords
                morphology          = none
                min_prefix_len      = 1
                html_strip          = 0

      }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_seo_state
{

                type                      =     tsvpipe
                tsvpipe_command           =     zcat /db/manticore/foo/foo_seo_state.gz          

                tsvpipe_field_string      =     lnameinitial
                tsvpipe_field_string      =     lastname
                tsvpipe_attr_string       =     firstname
                tsvpipe_attr_string       =     company
                tsvpipe_attr_string       =     title
                tsvpipe_field_string      =     location
                tsvpipe_field_string      =     state
                tsvpipe_attr_uint         =     popularity
                tsvpipe_attr_bigint       =     poseidonid
                tsvpipe_field_string      =     searchtype

}

table idx_foo_seo_state
      {

                source              = src_foo_seo_state
                path                = /db/manticore/foo/idx_foo_seo_state
                dict                = keywords
                morphology          = none
                min_prefix_len      = 1
                html_strip          = 0

      }

#############################################################################
## SOURCE DEFINITION - using main_src_[table_name]
#############################################################################

source src_foo_test_seo
{

                type                      =     tsvpipe
                tsvpipe_command           =     zcat /db/manticore/foo/foo_test_seo.gz          

                tsvpipe_field_string      =     hashval
                tsvpipe_attr_string       =     lnameinitial
                tsvpipe_attr_string       =     lastname
                tsvpipe_attr_string       =     firstname
                tsvpipe_attr_string       =     company
                tsvpipe_attr_string       =     title
                tsvpipe_attr_string       =     location
                tsvpipe_attr_string       =     state
                tsvpipe_attr_uint         =     popularity
                tsvpipe_attr_string       =     searchtype

}

table idx_foo_test_seo
      {

                source              = src_foo_test_seo
                path                = /db/manticore/foo/idx_foo_test_seo
                dict                = keywords
                morphology          = none
                min_prefix_len      = 1
                html_strip          = 0

      }

#############################################################################
## indexer settings
#############################################################################

indexer
{
    mem_limit       = 2047M
    write_buffer    = 1G
}

#############################################################################
## searchd settings
#############################################################################

searchd
{
    listen                  = 9312:sphinx
    listen                  = 9306:mysql41
    listen                  = 9307:mysql_vip
    listen                  = 9308:http

    listen                  = 127.0.0.1:9312:sphinx
    listen                  = 127.0.0.1:9306:mysql41
    listen                  = 127.0.0.1:9307:mysql_vip
    listen                  = 127.0.0.1:9308:http

    log                     = /var/log/manticore/searchd.log
    query_log               = /var/log/manticore/query.log
    query_log_mode          = 644
    client_timeout          = 500
    pid_file                = /var/run/manticore/searchd.pid
    unlink_old              = 1
    max_packet_size         = 50M
    max_filters             = 256
    max_filter_values       = 4096
    max_batch_queries       = 32
    agent_query_timeout     = 30000
    access_plain_attrs      = mmap_preread
    binlog_path             =
    docstore_cache_size     = 512M
    net_workers             = 2
    qcache_max_bytes        = 67108864
    read_buffer_docs        = 512K
    read_buffer_hits        = 512K
    #threads                 = 40
    pseudo_sharding         = 1
    query_log_format        = sphinxql
    mysql_version_string    = 5.7.1
    not_terms_only_allowed  = 1
    shutdown_timeout        = 5
    agent_retry_count       = 2
}

index idx_foo_search_full_20230908:idx_foo_search_full
        {
            path  = /db/manticore/foo/20230908/idx_foo_search_full
        }

index foo_seo_city_20230908:foo_seo_city
        {
            path  = /db/manticore/foo/20230908/foo_seo_city
        }

index foo_seo_first_20230908:foo_seo_first
        {
            path  = /db/manticore/foo/20230908/foo_seo_first
        }

index foo_seo_20230908:foo_seo
        {
            path  = /db/manticore/foo/20230908/foo_seo
        }

index foo_seo_opts_20230908:foo_seo_opts
        {
            path  = /db/manticore/foo/20230908/foo_seo_opts
        }

index foo_seo_state_20230908:foo_seo_state
        {
            path  = /db/manticore/foo/20230908/foo_seo_state
        }
kalafaye commented 1 year ago

Thank you so much for the formatting. I didn't notice the update until now. I'm hoping you guys are closer to figuring it out than I am...

kalafaye commented 1 year ago

I've tried tons of different things. I've cleaned out every trace of Manticore and added it back, etc. It SEEMS like the issue only happens when the scripts are being created dynamically (although I think that may be a red herring). Anyway, I finally got one to work by only having one index (no segmentation faults on checkconfig/start) but it never starts. It tries to start forever. I left it to 'start' (it was stuck in reallocation) and it never finished. Is there a better way to debug, or are there logs I'm missing? As soon as this started the searchd.log was wiped and nothing has been written there since.

sanikolaev commented 1 year ago

but it never starts

Do you mean that indexer never starts?

sanikolaev commented 1 year ago

As soon as this started the searchd.log was wiped

This is really strange, as searchd doesn't do that, and indexer doesn't know about the searchd log at all. It could be logrotate (or a similar tool) that rotated the log.
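One way to check the log rotation theory is to look for a logrotate rule that mentions the Manticore log directory. The following is only a sketch: the helper name `find_logrotate_rules` is made up for illustration, and `/etc/logrotate.d` is assumed to be where rules live (the usual location on Debian/Ubuntu, but your system may differ).

```shell
#!/bin/sh
# Hypothetical helper: list logrotate rule files that mention a given log path.
# $1 = log path or directory to look for
# $2 = directory holding logrotate rules (assumed default: /etc/logrotate.d)
find_logrotate_rules() {
    log_path="$1"
    rules_dir="${2:-/etc/logrotate.d}"
    # -r: recurse, -l: print only matching file names, -F: fixed string, not a regex
    grep -rlF -- "$log_path" "$rules_dir" 2>/dev/null
}

# Usage: find_logrotate_rules /var/log/manticore
```

If this prints a file name, that rule is worth reading to see whether it truncates or replaces `searchd.log`. If it prints nothing, logrotate is probably not the cause, although rules can also live in `/etc/logrotate.conf` itself.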

sanikolaev commented 1 year ago

I can't reproduce the crash if I run indexer like this:

snikolaev@dev2:~/issue_1480$ indexer -c manti.conf --all
Manticore 6.2.13 85ffbbbb8@230929 dev (columnar 2.2.5 b8be4eb@230928) (secondary 2.2.5 b8be4eb@230928)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2023, Manticore Software LTD (https://manticoresearch.com)

using config file '/home/snikolaev/issue_1480/manti.conf'...
FATAL: failed to parse config file '/home/snikolaev/issue_1480/manti.conf': ERROR: inherited section 'foo_seo_city_20230908': parent doesn't exist (parent name='index', type='(null)') in /home/snikolaev/issue_1480/manti.conf line 469 col 9.

snikolaev@dev2:~/issue_1480$

Can you please attach the proper config file which has the "index foo_seo_city" declaration?

Ideally, please provide a minified version which still reproduces the crash. FYI, a very minimal config can look like this:

searchd {
    listen = 9315:mysql41
    log = searchd.log
    pid_file = searchd.pid
    binlog_path =
}

source src {
    type = csvpipe
    csvpipe_command = echo "1,abc" && echo "2,abc" && echo "3,abc abc"
    csvpipe_field = f
}

index idx {
    type = plain
    source = src
    path = idx
}
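The "parent doesn't exist" error above comes from a section declared as `name:parent` where no section called `parent` is defined. As a rough pre-check before running indexer, a config can be scanned for such declarations. This is a sketch only: `find_missing_parents` is a hypothetical helper, it treats `index` and `table` as the same section kind, and it does not replace indexer's own parser.

```shell
#!/bin/sh
# Hypothetical lint: report inherited sections ("type name:parent") whose
# parent is not declared anywhere in the config file given as $1.
# Exits with status 1 if at least one missing parent is found.
find_missing_parents() {
    awk '
        # Ignore commented-out lines.
        /^[ \t]*#/ { next }
        # Section headers: "source a", "index a:b", "table a : b {".
        # Settings such as "source = src" are excluded by the last bracket.
        /^[ \t]*(source|index|table)[ \t]+[^ \t=]/ {
            line = $0
            sub(/[{].*/, "", line)                 # drop a brace on the same line
            gsub(/[ \t]*:[ \t]*/, ":", line)       # "a : b" becomes "a:b"
            $0 = line                              # re-split into fields
            kind = ($1 == "table") ? "index" : $1  # assumption: table == index
            split($2, part, ":")
            declared[kind SUBSEP part[1]] = 1
            if (part[2] != "") {
                child[++n] = kind SUBSEP part[1] SUBSEP part[2]
                where[n] = NR
            }
        }
        END {
            for (i = 1; i <= n; i++) {
                split(child[i], c, SUBSEP)
                key = c[1] SUBSEP c[3]
                if (!(key in declared)) {
                    printf "line %d: %s %s: parent %s not declared\n", where[i], c[1], c[2], c[3]
                    bad = 1
                }
            }
            exit bad + 0
        }
    ' "$1"
}

# Usage: find_missing_parents manti.conf
```

Each reported line names a child section and the parent it expects, which makes a mismatch such as `foo_seo_city` versus `idx_foo_seo_city` easier to spot.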
kalafaye commented 1 year ago

This is the one that works and 'preallocates' for far too long. But honestly, I'd like to know what caused the segmentation fault. If it's our setup size, I'd be so happy to be able to find/show that so we can ask for more space if we need it. lol

source src_foo
{
    type = tsvpipe
    tsvpipe_command = zcat /db/manticore/foo_v3/foo_search.gz

    tsvpipe_attr_bigint = poseidonid
    tsvpipe_field_string = fooname
    #tsvpipe_field_string = alt_fooname
    tsvpipe_attr_bigint = city_hash
    tsvpipe_attr_bigint = stateid
    tsvpipe_attr_bigint = housenumber_hash
    tsvpipe_attr_bigint = streetname_hash
    tsvpipe_attr_bigint = unitnumber_hash
    tsvpipe_attr_bigint = zipcode_hash
    tsvpipe_attr_bigint = county_hash
    tsvpipe_field_string = type_field
    tsvpipe_field_string = taxid
    tsvpipe_attr_uint = foo_type
    tsvpipe_attr_uint = foo_type_bit
}

#############################################################################
## index definition
#############################################################################

index idx_foo
{
    type = plain
    dict = keywords
    morphology = none
    blend_chars = &, @, U+23, U+2D, U+2E, U+AD
    #blend_mode = trim_head, trim_tail
    ignore_chars = U+27
    #charset_table = 0..9, english
    stopwords = /etc/manticoresearch/stop_words.txt #removed en
    source = src_foo
    path = /db/manticore/foo_v3/idx_foo
    wordforms = /etc/manticoresearch/foo_wordforms.txt
    min_prefix_len = 3
}

indexer
{
    mem_limit = 2047M
    write_buffer = 1G
}

# Searchd settings
searchd
{
    listen = 9312:sphinx
    listen = 9306:mysql41
    listen = 9307:mysql_vip
    listen = 9308:http

    listen = 127.0.0.1:9312:sphinx
    listen = 127.0.0.1:9306:mysql41
    listen = 127.0.0.1:9307:mysql_vip
    listen = 127.0.0.1:9308:http

    access_plain_attrs = mmap_preread
    agent_query_timeout = 30000
    agent_retry_count = 2
    binlog_path =
    client_timeout = 500
    docstore_cache_size = 512M
    log = /var/log/manticore/searchd.log
    max_batch_queries = 32
    max_filter_values = 4096
    max_filters = 256
    max_packet_size = 50M
    mysql_version_string = 5.7.1
    net_workers = 2
    not_terms_only_allowed = 0
    pid_file = /var/run/manticore/searchd.pid
    pseudo_sharding = 1
    qcache_max_bytes = 67108864
    query_log = /var/log/manticore/query.log
    query_log_format = sphinxql
    query_log_mode = 644
    read_buffer_docs = 512K
    read_buffer_hits = 512K
    shutdown_timeout = 5
    unlink_old = 1
}

kalafaye commented 1 year ago

new 47.txt

There is a ' right after source in that one, but that was from when I was trying to send it to you guys as code.

sanikolaev commented 1 year ago

The config you've attached, with the "' right after source" fixed so that it looks like this:

```
snikolaev@dev2:~/issue_1480$ cat manti.conf
source src_foo
{
    type = tsvpipe
    tsvpipe_command = zcat /db/manticore/foo_v3/foo_search.gz

    tsvpipe_attr_bigint = poseidonid
    tsvpipe_field_string = fooname
    #tsvpipe_field_string = alt_fooname
    tsvpipe_attr_bigint = city_hash
    tsvpipe_attr_bigint = stateid
    tsvpipe_attr_bigint = housenumber_hash
    tsvpipe_attr_bigint = streetname_hash
    tsvpipe_attr_bigint = unitnumber_hash
    tsvpipe_attr_bigint = zipcode_hash
    tsvpipe_attr_bigint = county_hash
    tsvpipe_field_string = type_field
    tsvpipe_field_string = taxid
    tsvpipe_attr_uint = foo_type
    tsvpipe_attr_uint = foo_type_bit
}

#############################################################################
## index definition
#############################################################################

index idx_foo
{
    type = plain
    dict = keywords
    morphology = none
    #blend_chars = &, @, U+23, U+2D, U+2E, U+AD
    #blend_mode = trim_head, trim_tail
    #ignore_chars = U+27
    #charset_table = 0..9, english
    stopwords = /etc/manticoresearch/stop_words.txt #removed en
    source = src_foo
    path = /db/manticore/foo_v3/idx_foo
    wordforms = /etc/manticoresearch/foo_wordforms.txt
    min_prefix_len = 3
}

indexer
{
    mem_limit = 2047M
    write_buffer = 1G
}

# Searchd settings
searchd
{
    listen = 9312:sphinx
    listen = 9306:mysql41
    listen = 9307:mysql_vip
    listen = 9308:http
    listen = 127.0.0.1:9312:sphinx
    listen = 127.0.0.1:9306:mysql41
    listen = 127.0.0.1:9307:mysql_vip
    listen = 127.0.0.1:9308:http
    access_plain_attrs = mmap_preread
    agent_query_timeout = 30000
    agent_retry_count = 2
    binlog_path =
    client_timeout = 500
    docstore_cache_size = 512M
    log = /var/log/manticore/searchd.log
    max_batch_queries = 32
    max_filter_values = 4096
    max_filters = 256
    max_packet_size = 50M
    mysql_version_string = 5.7.1
    net_workers = 2
    not_terms_only_allowed = 0
    pid_file = /var/run/manticore/searchd.pid
    pseudo_sharding = 1
    qcache_max_bytes = 67108864
    query_log = /var/log/manticore/query.log
    query_log_format = sphinxql
    query_log_mode = 644
    read_buffer_docs = 512K
    read_buffer_hits = 512K
    shutdown_timeout = 5
    unlink_old = 1
}
```

works like this:

snikolaev@dev2:~/issue_1480$ indexer -c manti.conf --all
Manticore 6.2.13 85ffbbbb8@230929 dev (columnar 2.2.5 b8be4eb@230928) (secondary 2.2.5 b8be4eb@230928)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2023, Manticore Software LTD (https://manticoresearch.com)

using config file '/home/snikolaev/issue_1480/manti.conf'...
indexing table 'idx_foo'...
WARNING: failed to load stopwords from either '/etc/manticoresearch/stop_words.txt' or '/usr/share/manticore/stopwords/stop_words.txt'
FATAL: failed to open /db/manticore/foo_v3/idx_foo.spl: No such file or directory, will not index. Try --rotate option.
snikolaev@dev2:~/issue_1480$ gzip: /db/manticore/foo_v3/foo_search.gz: No such file or directory

Please provide the following files:

/db/manticore/foo_v3/foo_search.gz
/etc/manticoresearch/stop_words.txt
/etc/manticoresearch/foo_wordforms.txt

If they contain sensitive data, feel free to use our write-only S3 storage: https://manual.manticoresearch.com/Reporting_bugs#Uploading-your-data
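Since the run above stops on missing files, it may help to confirm on the original machine that every file the config points at is actually present before collecting them. A minimal sketch follows, assuming settings are written as `key = value` with spaces around `=`, that `stopwords` and `wordforms` each name a single file, and that the pipe command has the form `zcat <file>`. The helper name `list_missing_config_files` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "missing: <path>" for every file referenced by
# stopwords, wordforms, or a "zcat <file>" pipe command in config $1 that
# does not exist on disk. Commented-out settings are ignored.
list_missing_config_files() {
    awk '
        /^[ \t]*#/ { next }
        ($1 == "stopwords" || $1 == "wordforms") && NF >= 3 { print $3 }
        $1 ~ /^(tsv|csv)pipe_command$/ && $3 == "zcat" && NF >= 4 { print $4 }
    ' "$1" | while read -r path; do
        [ -e "$path" ] || echo "missing: $path"
    done
}

# Usage: list_missing_config_files manti.conf
```

No output means all referenced files were found. The check says nothing about whether their contents are valid.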

tomatolog commented 1 year ago

There is a ' right after source in that one, but that was from when I was trying to send it to you guys as code.

Using your config with the erroneous symbol, I see the error message and a correct indexer exit:

Manticore 6.2.13 51ecd20e0@231004 dev (columnar 2.2.5 b8be4eb@230928) (secondary 2.2.5 b8be4eb@230928)

using config file '/home/stas/bin/gh1480/c1.conf'...
FATAL: failed to parse config file '/home/stas/bin/gh1480/c1.conf': ERROR: named section: expected name, got '`' in /home/stas/bin/gh1480/c1.conf line 1 col 7.

There is no crash.

tomatolog commented 1 year ago

Maybe the crash is related to the OS you run indexer on. Could you try to reproduce it in the Docker container we provide for release or dev versions?

kalafaye commented 1 year ago

So I've rewritten everything, and I'm pretty sure it comes from the fact that we upgraded without purging. But I did rewrite a ton of stuff too... If I figure out something else, or get the error again, I'll post the info here, just in case there's some weird bug it might help find. Thanks for all your help.