Open alanpaxton opened 1 year ago
@alanpaxton You can tell if an index is being used to evaluate a query by enabling tracing in monex's profiling pane: http://localhost:8080/exist/apps/monex/profiling.html. Enable tracing, clear, tare, run your query, then refresh and check the index usage tab.
It sounds like the query optimizer isn't recognizing the ability of this query to be optimized. Tips for achieving the greatest optimizations can be found in https://exist-db.org/exist/apps/doc/tuning. Certainly there is room to make improvements to the query optimizer (and corresponding updates to the article).
Thanks Joe, that's info I didn't have which should help me dig out the problem.
It would be interesting to know if this PR when merged also makes a difference to your issue @alanpaxton - https://github.com/eXist-db/exist/pull/4989
Describe the bug
A large corpus with the following index configured:
and re-indexed thus:
For a simple query, the index is being used:
This gives me the helpful result that range indices for @lemma are indexing effectively, and results are strongly dependent on the frequency of the terms:
but using the alternative query (as originally proposed by the owners of the corpus):
and we see the following times:
Expected behaviour
The second query should run in a similar time to the first; the fact that it is much slower suggests that the index is not being used in this case.
To Reproduce
Unfortunately a copy of a large corpus (30M words) was used to reproduce the results. It may be possible to understand the problem without the corpus. It may be possible for us to supply the corpus privately upon request.
Additional context
conf.xml? no