Closed d-maurer closed 5 years ago
@andbag
* The query parameter "range" cannot be optimized by CompositeIndex and should not rewrite the query. The structure of the "composite key" does not support such queries.
You could have a look at "https://github.com/d-maurer/Products.ZCatalog/blob/resultset_intersection%2355/src/Products/PluginIndexes/unindex.py#L509". For an UnIndex, it transforms (among others) a "range" query into (the equivalent of) a normal "or" query. It essentially translates the "range" specification into the equivalent key set.
I do not claim that CompositeIndex should optimize "range" queries, as the resulting key set may be huge. Nevertheless, it may be handy to remember that for an unspecialized index, a "range" query is simply an "or" query with a key set specified in an indirect way.
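To make the "range is an indirect or" point concrete, here is a minimal sketch. It is not the UnIndex code: the toy index is a plain dict rather than BTrees, and the names INDEX, range_to_keys, or_query and range_query are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy index: key -> set of document ids.
# (Assumption: a plain dict stands in for the real index storage.)
INDEX = {
    1: {10, 11},
    3: {12},
    5: {13, 14},
    8: {15},
}


def range_to_keys(index, lo=None, hi=None):
    """Translate a range specification into the explicit list of indexed keys."""
    keys = sorted(index)
    start = 0 if lo is None else bisect_left(keys, lo)
    stop = len(keys) if hi is None else bisect_right(keys, hi)
    return keys[start:stop]


def or_query(index, keys):
    """Plain "or" query: union of the document sets of the given keys."""
    result = set()
    for key in keys:
        result |= index.get(key, set())
    return result


def range_query(index, lo=None, hi=None):
    """A "range" query expressed as an "or" query over the derived key set."""
    return or_query(index, range_to_keys(index, lo, hi))


keys = range_to_keys(INDEX, lo=2, hi=5)  # keys actually present in [2, 5]
docs = range_query(INDEX, lo=2, hi=5)    # union of their document sets
```

Note that range_to_keys has to look at the keys actually stored in the index. The key set cannot be derived from the query alone, and for a wide range it can grow as large as the index itself.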
PR https://github.com/zopefoundation/Products.ZCatalog/pull/66 will fix the issue.
CompositeIndex analyses a query and rewrites it if the index thinks it can optimize the query. Because of the rewriting, it must reliably recognize the cases where the rewrite leads to a semantically equivalent query.
Currently, CompositeIndex operates on the query-provided keys only and does not access its (true) component indexes. As a consequence, it cannot optimize a subquery when the relevant keys depend on the true component index, as is the case for "range" queries.
CompositeIndex can currently wrongly optimize queries containing "range" subqueries, as demonstrated by the example below. In #57, I have provided the new method get_combiner_keys_info, which preprocesses a query (record) and determines the keys effectively used by the query. If CompositeIndex were ready to access the true composite index, then it could be used to remove the restriction.
CompositeIndex cannot optimize subqueries containing "not". It knows this and tries to recognize the case, but its condition is wrong (actually, it checks for a "pure not", not a "general not"). As a consequence, it usually produces wrong optimizations for subqueries containing "not" (ignoring the "not"); see the example below.
CompositeIndex could in principle optimize subqueries with "not". What it essentially does is replace "Or(ai) and Or(bj)" with "Or(ai and bj)". It could also replace "(Or(ai) and NA) and (Or(bj) and NB)" (where "NA" and "NB" represent optional "not" parts) with "Or(ai and bj) and NA and NB". If not missing, "NA" would be a "pure not". Such "not"s are costly. However, in the case above, a "filtering not" would be sufficient (provided it comes after the big "Or"), as we know that any relevant document is indexed by "A" (similarly for "B"). To support this efficiently, we would need to introduce "filtering queries" and ensure they are applied after the "normal" queries (note: AdvancedQuery supports filtering queries).
The following example demonstrates buggy optimizations by CompositeIndex: