manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0

Manticore crashes with `signal 11` when inserting data #1891

Open yharahuts opened 9 months ago

yharahuts commented 9 months ago

Describe the bug:

Manticore crashes when inserting a large amount of data into an index.

Manticore is running in RT mode with the following tables:

CREATE TABLE redacted_aaa (
id bigint,
entity_id text stored,
name text indexed,
description text indexed,
notes text indexed,
number text indexed,
holder text indexed,
is_deleted bool,
schema string attribute,
source string attribute,
attributes json
) min_infix_len='2' index_exact_words='1' charset_table='non_cjk, U+47->0' min_word_len='2' blend_chars='@, /, +, -, ., _' blend_mode='trim_none, trim_both, skip_pure' morphology='stem_ru' min_stemming_len='3' expand_keywords='1'

Data is inserted in (rather large?) batches of 500 records per insert, and the whole dataset contains about 100M rows split across 1-3 indexes. The crash happens randomly: the data may load without any problems at all, or the server may crash at ~1-2% of the load, at a random row.

Since it is a prod instance, I'm afraid I cannot give you our datasets or test multiple (older?) Manticore versions.

To Reproduce:

Steps to reproduce the behavior:

  1. Create the tables;
  2. Insert large dataset via batches;
  3. Randomly get a crash;

Expected behavior:

It should not crash.

Describe the environment:

Messages from log files:

docker logs shows the following:

rt: table redacted_aaa: diskchunk 7(8), segments 30  saved in 20.478723 (20.479030) sec, RAM saved/new 127530839/2099120 ratio 0.950000 (soft limit 127506841, conf limit 134217728)
rt: table redacted_bbb: diskchunk 5(6), segments 31  saved in 5.438967 (5.440291) sec, RAM saved/new 127699405/232936 ratio 0.950000 (soft limit 127506841, conf limit 134217728)
# ~20-30 of similar lines 
Crash!!! Handling signal 11

After that it restarts with:

binlog: replaying log /var/lib/manticore/binlog/binlog.011
binlog: table redacted_aaa: recovered from tid 23457 to tid 23487
binlog: table redacted_bbb: recovered from tid 5380 to tid 5418
binlog: replay stats: 316 commits; 0 updates, 0 reconfigure; 0 pq-add; 0 pq-delete; 0 pq-add-delete, 2 tables
binlog: finished replaying /var/lib/manticore/binlog/binlog.011; 85.2 MB in 0.551 sec
binlog: finished replaying total 1 in 0.552 sec

Additional context:

While writing this issue, I came up with two ideas:

I'll try both options, but since the crash happens randomly, I can't guarantee whether they will work.

Any advice is greatly appreciated.

indextool --check on both indexes returns:

# other chunks gives same output
checking disk chunk, extension 16, 16(17)...
WARNING: secondary library not loaded; secondary index(es) disabled
checking schema...
checking dictionary...
checking data...
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check passed, 196.3 sec elapsed
check passed, 196.3 sec elapsed

sanikolaev commented 9 months ago

I can't reproduce a crash in 6.2.12 with the loading script based on your schema:

```
#!/usr/bin/php
<?php
// NOTE: the opening lines of the script were lost when the page was extracted.
// The usage check below is reconstructed from how $argv is used further down.
if (count($argv) < 5) die("Usage: ".__FILE__." <batch size> <concurrency> <docs> <multiplier>\n");

// This function waits for an idle mysql connection for the $query, runs it and exits
function process($query) {
    global $all_links;
    global $requests;

    // First pass: use any connection that has no request in flight yet
    foreach ($all_links as $k=>$link) {
        if (@$requests[$k]) continue;
        mysqli_query($link, $query, MYSQLI_ASYNC);
        @$requests[$k] = microtime(true);
        return true;
    }

    // All connections are busy: poll until one of them finishes
    do {
        $links = $errors = $reject = array();
        foreach ($all_links as $link) {
            $links[] = $errors[] = $reject[] = $link;
        }
        $count = @mysqli_poll($links, $errors, $reject, 0, 1000);
        if ($count > 0) {
            foreach ($links as $j=>$link) {
                $res = @mysqli_reap_async_query($links[$j]);
                // Find the index $i of this connection in the original pool
                foreach ($all_links as $i=>$link_orig) if ($all_links[$i] === $links[$j]) break;
                if ($link->error) {
                    echo "ERROR: {$link->error}\n";
                    if (!mysqli_ping($link)) {
                        echo "ERROR: mysql connection is down, removing it from the pool\n";
                        unset($all_links[$i]); // remove the original link from the pool
                        unset($requests[$i]); // and from the $requests too
                    }
                    return false;
                }
                if ($res === false and !$link->error) continue;
                if (is_object($res)) {
                    mysqli_free_result($res);
                }
                $requests[$i] = microtime(true);
                mysqli_query($link, $query, MYSQLI_ASYNC); // making next query
                return true;
            }
        };
    } while (true);
    return true;
}

// Open $argv[2] connections to Manticore over the MySQL protocol
$all_links = [];
$requests = [];
$c = 0;
for ($i=0;$i<$argv[2];$i++) {
    $m = @mysqli_connect('127.0.0.1', '', '', '', 9306);
    if (mysqli_connect_error()) die("Cannot connect to Manticore\n");
    $all_links[] = $m;
}

// init
mysqli_query($all_links[0], "drop table if exists redacted_aaa");
mysqli_query($all_links[0], "CREATE TABLE redacted_aaa ( id bigint, entity_id text stored, name text indexed, description text indexed, notes text indexed, number text indexed, holder text indexed, is_deleted bool, schema string attribute, source string attribute, attributes json ) min_infix_len='2' index_exact_words='1' charset_table='non_cjk, U+47->0' min_word_len='2' blend_chars='@, /, +, -, ., _' blend_mode='trim_none, trim_both, skip_pure' morphology='stem_ru' min_stemming_len='3' expand_keywords='1'");

$batch = [];
$query_start = "insert into redacted_aaa(id, entity_id, name, description, notes, number, holder, is_deleted, schema, source, attributes) values ";
echo "preparing...\n";
$error = false;
$cache_file_name = '/tmp/'.md5($query_start).'_'.$argv[1].'_'.$argv[3];
$c = 0;

// Filler text used for both the description and notes fields
$lorem = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum';

// Build $argv[3] documents in batches of $argv[1], or load them from the cache file
if (!file_exists($cache_file_name)) {
    $batches = [];
    while ($c < $argv[3]) {
        $batch[] = "($c, '1234567890', 'john smith', '$lorem', '$lorem', '0123456789', 'holder1 holder2 holder3', 'true', '{\"a\": 123, \"b\": 345}', 'source', '{\"a\": 123, \"b\": {\"c\": 1.2, \"d\": true}}')";
        $c++;
        if (floor($c/1000) == $c/1000) echo "\r".($c/$argv[3]*100)."% ";
        if (count($batch) == $argv[1]) {
            $batches[] = $query_start.implode(',', $batch);
            $batch = [];
        }
    }
    if ($batch) $batches[] = $query_start.implode(',', $batch);
    file_put_contents($cache_file_name, serialize($batches));
} else {
    echo "found in cache $cache_file_name\n";
    $batches = unserialize(file_get_contents($cache_file_name));
}

// Repeat the whole set of batches $argv[4] times
$batchesMulti = [];
for ($n=0;$n<$argv[4];$n++) $batchesMulti = array_merge($batchesMulti, $batches);
$batches = $batchesMulti;

echo "querying...\n";
$t = microtime(true);
foreach ($batches as $batch) {
    if (!process($batch)) die("ERROR\n");
}

// wait until all the workers finish
do {
    $links = $errors = $reject = array();
    foreach ($all_links as $link) $links[] = $errors[] = $reject[] = $link;
    $count = @mysqli_poll($links, $errors, $reject, 0, 100);
} while (count($all_links) != count($links) + count($errors) + count($reject));

echo "finished inserting\n";
echo "Total time: ".(microtime(true) - $t)."\n";
echo round($argv[3] * $argv[4] / (microtime(true) - $t))." docs per sec\n";
```

even with the higher concurrency of 8:

# php ~/load_1891.php 500 8 10000000 1
preparing...
100%       querying...
finished inserting
Total time: 832.53445100784
12012 docs per sec
mysql> select count(*) from redacted_aaa;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (0.00 sec)

There was a somewhat similar issue https://github.com/manticoresoftware/manticoresearch/issues/1458#issuecomment-1790605768 which has already been fixed. I suggest you check if the crash persists in the latest dev version - https://mnt.cr/dev/nightly

You can also try modifying the script until it reproduces the crash, so that we can reproduce it on our end and fix it.

yharahuts commented 9 months ago

@sanikolaev it happens very randomly: I can load 100 GB of data without any problems at all, or have problems with a 15 GB dataset at a random point.

It is just like your comment on that issue:

> It's also very unstable: sometimes the provided script works fine for the whole night, sometimes it crashes in a minute after started.

I'm currently testing the manticoresearch/manticore:dev image, but I will need some (rather long) time to test with various data and confirm that it is a duplicate and that it is fixed.

MirosOwners commented 8 months ago

I don't know if it helps, but I managed to get into a similar state with two vector fields and the columnar engine in the same table.

sanikolaev commented 8 months ago

@MirosOwners Do you mean in the same table as in the script here https://github.com/manticoresoftware/manticoresearch/issues/1891#issuecomment-1971707674 ?

yharahuts commented 7 months ago

It stopped crashing on this index with the dev version, but started to crash on another index. This time the logs are clean: Manticore just dies and starts again as if nothing happened.

Edit: as far as I can see, it just slowly exhausts all available memory. Any ideas how to debug this?

I've tried adding `FLUSH RAMCHUNK` during inserts, but no luck.
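One generic way to check whether memory really grows steadily during a load is to sample the resident set size of the server process while the inserts run. The following is only a sketch under assumptions: it assumes a Linux host where `/proc/<pid>/status` is readable, and the function names are hypothetical. The PID of `searchd` has to be looked up separately.

```python
import time
from typing import List, Optional, Tuple


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Extract VmRSS (resident memory, in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident memory of process `pid` (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vm_rss_kb(fh.read())


def watch_rss(
    pid: int, interval_sec: float = 5.0, samples: int = 10
) -> List[Tuple[float, Optional[int]]]:
    """Sample resident memory periodically and return (timestamp, rss_kb) pairs."""
    history = []
    for _ in range(samples):
        history.append((time.time(), read_rss_kb(pid)))
        time.sleep(interval_sec)
    return history
```

Comparing the samples against the insert progress shows whether growth is gradual or happens in steps, which is useful detail to include in a report.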

sanikolaev commented 7 months ago

> Edit: as far as I can see, it just slowly exhausts all available memory. Any ideas how to debug this?

@yharahuts So it doesn't crash in the dev version, and an OOM occurs instead?

bZichett commented 3 days ago

I've been dealing with this sporadic problem for weeks now. I finally found this thread and after review, one comment stood out:

@MirosOwners

> I don't know if it helps, but I managed to get into a similar state with two vector fields and the columnar engine in the same table.

Although I cannot say precisely when the problem started, I do know that I somewhat recently (weeks ago) added vector fields to my table (3 of them, dim = 384, hnsw, l2). I cannot recall having this problem before doing so, although I am not positive.

The problem occurs sporadically during high-throughput indexing, whether through the bulk API or not: the server crashes with signal 11, and upon restart and replaying the binlog, it also crashes (perpetual crash loop from there). I immediately implemented a sleep mechanism between the batches, which may help but does not solve the issue. It does not occur when indexing small amounts, and I can use these 3 vector fields at search time.
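The sleep mechanism mentioned above could look roughly like the sketch below. It is not the code actually used in this report: `send_throttled` is a hypothetical name, and `send` stands for whatever function submits one batch to the server.

```python
import time
from typing import Callable, Iterable


def send_throttled(
    batches: Iterable[str],
    send: Callable[[str], None],
    pause_sec: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send each batch via `send`, pausing `pause_sec` between consecutive batches.

    Returns the number of batches sent. `sleep` is injectable so the pause
    can be replaced in tests.
    """
    sent = 0
    for batch in batches:
        if sent:
            # Pause only between batches, not before the first one.
            sleep(pause_sec)
        send(batch)
        sent += 1
    return sent
```

As noted in the comment, throttling like this only lowers the insert rate; it is a workaround and not a fix for the crash.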

I seem to be able to reset it to a stable state with `rm -rf table_name` and then re-indexing smaller amounts, or with sleeps added between calls, but it looks like I can reintroduce the bug just by throwing data at the instance for long enough (I re-index in my development environment frequently and sometimes test large datasets).

My initial test, which I don't see as confirmation but which inspired me to post this message with this detail:

I just removed the three vector fields from the RT CREATE TABLE statement and successfully ran a full reindex of my dataset with no sleep between batches of 100, taking 450 seconds in total. This is only 23k rows, but some of the fields are large (~1 MB JSON file). I don't think anything spectacular is going on here; the dataset isn't even that big (250 MB). More importantly, regarding the vector fields, it does not matter whether or not I generate a value for the Manticore index to consume. In short, the mere existence of vector fields may actually have something to do with this, although I can't reason why that would be the case myself. I'm just reporting it with ~medium confidence so that a dev can filter through the above info.

tomatolog commented 3 days ago

If you have a crash loop

> and upon restart and replaying the binlog, it also crashes (perpetual crash loop from there)

it would be better to upload your index files along with the binlog, so that we can reproduce that crash loop here and fix the issue. You can upload your data as described in the manual: https://manual.manticoresearch.com/dev/Reporting_bugs#Uploading-your-data

bZichett commented 3 days ago

If I just got lucky and the problem happens again, I can think about how to safely send anonymized data. I need to assess the feasibility for the rest of the fields, and there is still data that I do not wish to send.
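One possible approach to anonymizing text while keeping its shape (length, word boundaries, punctuation, and case) is sketched below. This is only an illustration with a hypothetical function name. It replaces every letter with a random ASCII letter, so non-ASCII text changes script and byte length, and nothing guarantees that scrambled data still triggers the same crash.

```python
import random
import string


def scramble_text(text: str, seed: int = 0) -> str:
    """Replace letters and digits with random ones, keeping everything else as-is.

    The output has the same number of characters, the same whitespace and
    punctuation positions, and the same upper/lower case pattern as the input.
    The same seed always produces the same output for the same input.
    """
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)
    return "".join(out)
```

A fixed seed makes the transformation repeatable, so the same scrambled dataset can be regenerated later if a run needs to be repeated.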

Based on the details of my report, I just wanted to clarify again that, although I can't make sense of it, everything points directly to the issue not being the data itself. I am using the entire dataset fine after removing the 3 vector fields at table creation. Finally, in either case, the sporadic bug happens during indexing without even passing values for those vector fields.

I will test further, maybe run an even larger job, and report back only if the problem begins again. If the bad state does not recur, then since the only change made was not adding these fields, I can immediately pass along the CREATE TABLE statement, although it's just 3 vector fields (384 dims, hnsw, l2) and about 25 other fields.

@tomatolog or anyone else: if you suspect that the root cause is in fact the data, and that I am thus missing something crucial about the code / architecture itself, I'd appreciate you clarifying that as well.

tomatolog commented 3 days ago

All crashes that could be reproduced locally are already fixed or on their way into the master branch.

Our team has no clue what could cause such a crash, as we have neither data that reproduces it nor the crash log from searchd.log where the crash stack could be checked.
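To collect the crash section of a log for a report, something like the following sketch may help. It searches for the `Crash!!! Handling signal` line shown earlier in this thread and keeps a number of lines after it. The function name and the default of 40 context lines are assumptions, and the location of `searchd.log` depends on the installation.

```python
from typing import Iterable, List


def crash_excerpts(
    lines: Iterable[str],
    marker: str = "Crash!!! Handling signal",
    context: int = 40,
) -> List[List[str]]:
    """For each line containing `marker`, return that line plus up to `context` following lines."""
    buffered = list(lines)
    excerpts = []
    for idx, line in enumerate(buffered):
        if marker in line:
            window = buffered[idx : idx + context + 1]
            excerpts.append([entry.rstrip("\n") for entry in window])
    return excerpts
```

It can be fed an open file object, for example `crash_excerpts(open(path_to_log))`, where `path_to_log` is wherever the instance writes its log.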

bZichett commented 3 days ago

Completely understand.

Is there any viable hypothesis about how adding vector fields to the table could cause this? I can go back and forth to gain more certainty that this is it. Other than that, I am not sure I can help atm with submitting data; I will think on that more.

sanikolaev commented 21 hours ago

It would be much easier if we had at least one of the following:

Without these, it may be hard to resolve the issue. It's best to have all of them, as this significantly improves the chances of finding a solution.