Closed glukkkk closed 3 years ago
Thank you for such a clean reproducible case!
@sanikolaev Is this our fix?
https://github.com/manticoresoftware/manticoresearch/commit/2a474dc1a26e8b0f8aaaae95669caf2f1d4b7746
@sanikolaev Is this our fix?
No, it's for another crash.
Ok, we're waiting :)
➤ Sergey Nikolaev commented:
I'm not sure we'll be able to address it soon as this crash is caused by a wrong use of Manticore - you are not supposed to do:
You are supposed to use attributes for that, e.g. string attribute in your case:
mysql> create table crash (data_id string, period text) morphology='stem_enru' rt_mem_limit='1024M' index_exact_words='1' expand_keywords='1' min_infix_len='3' charset_table='0..9, english, russian, _';
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO crash (data_id, period) VALUES ('00000000-0000-0000-0000-000000000000', '5');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT data_id, COUNT(DISTINCT data_id) AS cnt FROM crash WHERE MATCH('(@period \"5\")') AND data_id <> '00000000-0000-0000-0000-000000000000' OPTION ranker = 'wordcount';
+---------+------+
| data_id | cnt |
+---------+------+
| NULL | 0 |
+---------+------+
1 row in set, 1 warning (0.01 sec)
FFR: minimized test which crashes:
Ok, the truth is that we do not need the data_id <> '00000000-0000-0000-0000-000000000000 condition. You can simplify the query to this one:
SELECT data_id, COUNT(DISTINCT data_id) AS cnt FROM testrt WHERE MATCH('(@period \"5\")') OPTION ranker = 'wordcount';
And it will crash too. We need the data_id to be indexed and stored as long as we need to return it in SELECT queries and search by data_id using UUIDs in conditions (MATCH will be faster here rather than using = operator).
BTW, in your minimized example you can use the following query and you will get the same problem:
SELECT t, COUNT(distinct t) AS cnt FROM crash;
So, it is a bug!
@glukkkk You can't do count(<non-attribute>)
or count(distinct <non-attribute>)
. If you need COUNT(DISTINCT data_id)
then you should use an attribute, not a stored full-text field (like in your case - data_id text
). If you want data_id to be a full-text field and a string attribute at the same time you are supposed to define it as data_id string attribute indexed
.
@sanikolaev Thank you for the clarification! It's OK for us to modify the config.
However, I suggest that you should throw some SQL exception in this case (something like COUNT(DISTINCT field) supports only attributes), but not crash the searchd service.
You can close the issue. Thank you.
➤ Aleksey N. Vinogradov commented:
However, this very case includes following different defects (leaks of abstractions):
SELECT COUNT(DISTINCT data_id) AS cnt FROM testrt;
SELECT data_id, COUNT(DISTINCT data_id) AS cnt FROM testrt;
mysql> SELECT id, data_id FROM testrt WHERE data_id <> '00000000-0000-0000-0000-000000000000';
+---------------+---------+
| id | data_id |
+---------------+---------+
| 1677721600001 | |
+---------------+---------+
No segfault, but also no filtering.
mysql> SELECT id, data_id FROM testrt;
+---------------+--------------------------------------+
| id | data_id |
+---------------+--------------------------------------+
| 1677721600001 | 00000000-0000-0000-0000-000000000000 |
+---------------+--------------------------------------+
reference to previous case - data_id is actually present.
mysql> SELECT * FROM testrt WHERE data_id <> '00000000-0000-0000-0000-000000000000' AND period <> '5';
+---------------+---------+--------+
| id | data_id | period |
+---------------+---------+--------+
| 1677721600001 | | |
+---------------+---------+--------+
No segfault, but weird output.
mysql> SELECT * FROM testrt WHERE data_id = '00000000-0000-0000-0000-000000000000';
Empty set (0.00 sec)
No segfault, but weird output.
Describe the bug
Manticore crashes on a simple query.
To Reproduce
Just execute these queries and you will see the problem:
NSERT INTO testrt (data_id, period) VALUES ('00000000-0000-0000-0000-000000000000', '5');`