Open roamye opened 1 month ago
This ticket's scope may be too broad. It's typically better to focus on one timeout / perf issue at a time.
When the last page of a search's results is requested, the frontend specifies pagination parameters that make the current backend implementation perform a search that has to filter page * page length results. Filtering is the process of pulling each document from disk, through the d-node and then e-node caches, to validate that it meets all search criteria. The process may end up dropping some results when the indexes alone (unfiltered) were unable to apply all of the criteria.
When searching for objects containing "colors", the unfiltered estimate comes back as 109,127. With cold caches, it took 22.5 seconds to get to the last seven results. With warm caches, it still took 16.1 seconds. An unfiltered search consistently returns the same last seven results in 117 milliseconds, illustrating that this search can be accurately resolved by indexes alone and that the filtering process adds 16 or so seconds to double-check the 109K results.
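To make the "page * page length" cost concrete, here is a minimal sketch of the window a last-page request covers. The function name `lastPageWindow` is hypothetical, and the page length of 20 is an assumption: it is not stated above, though it is consistent with the "last seven" results (109,127 leaves a remainder of 7 when divided by 20). The estimate is unfiltered, so a filtered total could be smaller.

```javascript
// Sketch: compute which results a "last page" request covers.
// estimate   - total result count (unfiltered estimate in this ticket)
// pageLength - results per page (20 is an assumption, not confirmed above)
function lastPageWindow(estimate, pageLength) {
  if (estimate <= 0 || pageLength <= 0) {
    return { page: 0, start: 0, count: 0 };
  }
  const page = Math.ceil(estimate / pageLength); // 1-based number of the last page
  const start = (page - 1) * pageLength + 1;     // 1-based position of its first result
  const count = estimate - (start - 1);          // results actually on the last page
  return { page, start, count };
}

const lastWindow = lastPageWindow(109127, 20);
console.log(lastWindow); // { page: 5457, start: 109121, count: 7 }
```

Under these assumptions, a filtered search has to validate roughly 109K documents before it can return the seven on page 5457, which matches the timings reported above.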
Things we may want to discuss:
cts.andQuery([
  cts.jsonPropertyValueQuery(
    'dataType',
    ['DigitalObject', 'HumanMadeObject'],
    ['exact']
  ),
  cts.orQuery([
    cts.fieldWordQuery(
      ['itemAnyText'],
      'colors',
      [
        'case-insensitive',
        'diacritic-insensitive',
        'punctuation-insensitive',
        'whitespace-insensitive',
        'stemmed',
        'wildcarded',
      ],
      1
    ),
    cts.tripleRangeQuery(
      [],
      [[lux('itemAny')]],
      fn.insertBefore(
        cts.values(
          cts.iriReference(),
          '',
          ['eager', 'concurrent'],
          cts.fieldWordQuery(
            ['referencePrimaryName'],
            'colors',
            [
              'case-insensitive',
              'diacritic-insensitive',
              'punctuation-insensitive',
              'whitespace-insensitive',
              'stemmed',
              'wildcarded',
            ],
            1
          )
        ),
        0,
        sem.iri('/does/not/exist')
      ),
      '=',
      [],
      1
    ),
  ]),
]);
I received some internal feedback...
You could have the front end reverse the sort and select the first page when they click the last button
It's a valid idea, but I like that no. 2 is more comprehensive and doesn't impose on backend endpoint consumers.
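The reverse-sort suggestion can be sketched as follows. This is only an illustration under stated assumptions: the request shape (`sortOrder`, `page`, `pageLength`), the helper names, and the in-memory `search` stand-in are all hypothetical and are not the LUX frontend or backend API. It also assumes the total result count is known and that the sort has no ties.

```javascript
// Sketch: turn a "last page" request into a "first page" request with the
// sort order flipped, then flip the returned results back for display.
function rewriteLastPageRequest(request, total) {
  const lastPage = Math.ceil(total / request.pageLength);
  if (request.page !== lastPage || lastPage <= 1) {
    return { request, reversed: false }; // not a last-page request; leave as-is
  }
  // The last page may be partial, so only ask for as many results as it holds.
  const remainder = total % request.pageLength || request.pageLength;
  const sortOrder = request.sortOrder === 'ascending' ? 'descending' : 'ascending';
  return {
    request: { ...request, sortOrder, page: 1, pageLength: remainder },
    reversed: true,
  };
}

function restoreOrder(results, reversed) {
  return reversed ? [...results].reverse() : results;
}

// Hypothetical in-memory stand-in for the backend search endpoint.
function search(docs, { sortOrder, page, pageLength }) {
  const sorted = [...docs].sort((a, b) => (sortOrder === 'ascending' ? a - b : b - a));
  return sorted.slice((page - 1) * pageLength, page * pageLength);
}

const docs = Array.from({ length: 47 }, (_, i) => i + 1);
const original = { sortOrder: 'ascending', page: 3, pageLength: 20 };
const rewritten = rewriteLastPageRequest(original, docs.length);
const viaReverse = restoreOrder(search(docs, rewritten.request), rewritten.reversed);
console.log(viaReverse); // [ 41, 42, 43, 44, 45, 46, 47 ]
```

With this approach the backend only has to produce the first few results of the reversed ordering, but every endpoint consumer would need the same logic, which is the concern raised in the reply above.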
16 sec for 100k is 0.16 msec per doc, which is pretty quick. It may not be expected that this is single-threaded on the e-node, but I believe that is the case for that stage of the query processing.
I confirmed the single-thread belief and submitted https://progressdataplatform.ideas.aha.io/ideas/ML-I-42.
Having said that, I am still much more in favor of option no. 1.
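As a quick check of the per-document figure quoted in the feedback, the arithmetic can be written out as below. The helper name `perDocMillis` is hypothetical; the inputs are the rounded figures from the feedback and the warm-cache measurement reported earlier in this ticket.

```javascript
// Sketch: average filtering time per document, in milliseconds.
function perDocMillis(totalSeconds, docCount) {
  return (totalSeconds * 1000) / docCount;
}

// Rounded figures from the feedback: 16 s over 100k documents.
console.log(perDocMillis(16, 100000).toFixed(2)); // "0.16"

// Measured figures from this ticket: 16.1 s (warm caches) over 109,127 results.
console.log(perDocMillis(16.1, 109127).toFixed(2)); // "0.15"
```

Either way the per-document cost is small; the total is large because, if filtering is single-threaded at that stage, the documents are validated one after another.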
@brent-hartwig I agree it would be great to look at option no. 1. Many of our searches are resolvable using only indexes, so they should be able to go unfiltered. It would be interesting to see which subset of searches requires filtering, and whether there is a way to make those work unfiltered or to just accept the false positives.
Part of CTS/Optic batch no. 3 👍. We are becoming familiar with what cannot be resolved by indexes, including punctuation and whitespace. We need a complete list, and we need to decide whether those items are important enough to LUX to accept the performance penalty imposed by filtering and to be prevented from adopting Optic. Whenever we encounter a false positive that isn't explained by the list, we need to question our index configuration and code.
Problem Description: There have been several timeout errors in Advanced Search queries which only have one multiple field (see example below). This issue serves as a research ticket for performance enhancements addressing timeout issues not related to #113.
Another timeout issue occurs when users select the last page of a search's results (example below).
Expected Behavior/Solution: Research how to fix these timeout issues. Solution is TBD. Possible solution: increase our timeout.
Requirements: TBD
Needed for promotion: If an item on the list is not needed, it should be crossed off but not removed.
UAT/LUX Examples:
Dependencies/Blocks:
Related Github Issues:
Related links:
Wireframe/Mockup: Place wireframe/mockup for the proposed solution at end of ticket.