Closed oruchreis closed 4 years ago
Hello oruchreis, thanks for reporting the issue. We'll take a look at it in the nearest future.
Hi, We can't reproduce this issue with simple data set. Plz provide more details about this issue: Indexes structure, data structure, exact query, query explain ouptut
Hi,
Sorry I forgot to mention about built environment. I could reproduce this issue with windows build(MSVC) of standalone reindexer server. In linux and osx build I can't reproduce this issue neither. I simply cloned latest master branch to my local and built x64-Debug(I've tried Release also) with Visual Studio 2019. I've only enabled "LINK_RESOURCES" option for face and swagger. Here are the structures:
Namespace: Test
Indexes:
{
"total_items": 1,
"items": [
{
"name": "Id",
"field_type": "int64",
"index_type": "hash",
"is_pk": true,
"is_array": false,
"is_dense": true,
"is_sparse": false,
"collate_mode": "none",
"sort_order_letters": "",
"expire_after": 0,
"config": {},
"json_paths": [
"Id"
]
}
]
}
Data:
{
"items": [
{
"Id": 1
},
{
"Id": 2
}
],
"namespaces": [
"Test"
],
"cache_enabled": true,
"total_items": 2
}
The query that fails:
SELECT Id FROM Test WHERE Id IN (1,1,1,1)
If I add EXPLAIN
to start of the query, the application crashes.
I've also tested x64-release build and the setup file from appveyor artifacts (which afaik it is VS 2017 and x64-release). In both release version the query failed like before, but I could get the explain result without crash. Here are explain outputs of the failed query:
{
"items": [],
"namespaces": [
"Test"
],
"cache_enabled": true,
"explain": {
"total_us": 56,
"prepare_us": 25,
"indexes_us": 21,
"postprocess_us": 6,
"loop_us": 3,
"general_sort_us": 0,
"sort_index": "-",
"sort_by_uncommitted_index": false,
"selectors": [
{
"items": 2,
"matched": 2,
"method": "scan",
"type": "SingleRange"
},
{
"field": "Id",
"keys": 0,
"comparators": 1,
"cost": 3,
"matched": 0,
"method": "scan",
"type": "OnlyComparator"
}
]
}
}
Actually problem looks like tsl::hopscotch_set or MSVC compiler bug. I wrote issue to tsl::hopscotch_set repo - https://github.com/Tessil/hopscotch-map/issues/52
In the next reindexer release we will introduce workaround: just replace emplace
with insert
here: https://github.com/Restream/reindexer/blob/master/cpp_src/core/comparatorimpl.h#L102
I've figured out that if I set kMaxSelectivityPercentForIdset to 100 in indexunorderd.cc, it works like before v2.9.2. I think there is a bug in this condition in indexunorderd.cc at line 192:
This change can seriously slow down queries execution with wide IN () condition to low selectivity indexes. Actually, kMaxSelectivityPercentForIdset mean, that 'choose scan method instead of index, if scan will be more efficient'
I confirm that it is fixed in v2.11.1. Thanks for your effort. Best Regards.
Hi, After v2.9.2 we can't search small namespaces with IN condition but bigger namespaces works expected. This works on small namespaces:
Id IN (1)
But this doesn't work:Id IN (1,1,1,1)
If items count in the namespace is bigger than 50 or something it works expected. I've figured out that if I setkMaxSelectivityPercentForIdset
to100
inindexunorderd.cc
, it works like before v2.9.2. I think there is a bug in this condition inindexunorderd.cc
at line 192: https://github.com/Restream/reindexer/blob/05757895ae35089133fd96a8ab373520bf4bc03d/cpp_src/core/index/indexunordered.cc#L192Best Regards.