Closed by nicolasfranck 4 years ago
Version 0.0502 solves this for Elasticsearch by introducing the search_strategy setting. The default is 'paginate'. When it is set to 'es.scroll' and Catmandu::Store::ElasticSearch > 1.02 is used, deep paging is avoided. We could introduce a 'solr.search_after' strategy to address the same problem with Solr. An 'es.search_after' strategy could also be implemented.
Solr and ElasticSearch suffer from a deep paging problem:
cf. https://lucidworks.com/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
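This means that the deeper you page into the results, the slower the response becomes. The culprit is the sorting: to return the page at a given offset, the server has to collect and sort every record up to offset + limit.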
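To make the cost concrete, here is a minimal sketch of classic offset paging. It only builds Solr-style request parameters and does no network I/O. The helper names and the sort field `id` are assumptions chosen for illustration; they are not taken from Catmandu or from the OAI code.

```python
def offset_page_params(page, limit, key="id"):
    """Build Solr-style parameters for offset paging (page is 0-based).

    The field name "id" is a hypothetical unique key.
    """
    return {"q": "*:*", "sort": f"{key} asc", "start": page * limit, "rows": limit}


def records_ranked(params):
    """Number of records the server must rank to serve this page."""
    return params["start"] + params["rows"]


first = offset_page_params(0, 100)
deep = offset_page_params(1000, 100)
print(records_ranked(first), records_ranked(deep))
```

With a limit of 100, the first page needs 100 ranked records, while page 1000 needs 100100, even though both return the same number of hits.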
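A better way is to sort on a unique key and add a range filter on that key that starts just after the last record of the previous page.
The filter makes sure that previous records are not included in the hits ("{" means "exclusive", "]" means "inclusive"), so Solr never needs to sort more than "limit" records.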
This way the wall time per request stays roughly constant instead of growing with the offset.
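The range-filter approach can be sketched as follows. This is an illustration under stated assumptions, not the Catmandu implementation: the unique key `id`, the helper names, and the in-memory `make_fake_fetch` stand-in for a Solr request are all hypothetical. A real client would also have to escape special characters in the identifier before placing it in the filter.

```python
def next_page_params(last_id, limit, key="id"):
    """Build Solr-style parameters for the page that follows last_id."""
    params = {"q": "*:*", "sort": f"{key} asc", "rows": limit}
    if last_id is not None:
        # "{" excludes last_id itself; "*]" leaves the upper bound open.
        params["fq"] = f"{key}:{{{last_id} TO *]"
    return params


def iterate_all(fetch, limit, key="id"):
    """Yield every record, one page at a time, without using an offset."""
    last_id = None
    while True:
        docs = fetch(next_page_params(last_id, limit, key))
        if not docs:
            return
        yield from docs
        last_id = docs[-1][key]


def make_fake_fetch(docs, key="id"):
    """In-memory stand-in for a Solr request (hypothetical, for the demo)."""
    ordered = sorted(docs, key=lambda d: d[key])

    def fetch(params):
        lower = None
        if "fq" in params:
            # Recover the exclusive lower bound from "id:{LOWER TO *]".
            lower = params["fq"].split("{", 1)[1].split(" TO ", 1)[0]
        hits = [d for d in ordered if lower is None or d[key] > lower]
        return hits[: params["rows"]]

    return fetch


records = [{"id": f"rec-{i:04d}"} for i in range(250)]
seen = list(iterate_all(make_fake_fetch(records), limit=100))
print(len(seen), next_page_params("rec-0099", 100)["fq"])
```

Every request asks for at most `limit` rows and never sets `start`, so the amount of ranking work per request does not depend on how far the iteration has progressed.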
How can we add this filter when the OAI layer knows nothing about the bag implementation?