Open huw0 opened 2 months ago
@raunaqmorarka PTAL
This may be related to the inability to push down certain expressions into the connector due to intervening implicit casts.
@huw0, can you provide the result of EXPLAIN ANALYZE for one of the queries that works as you expect and the one that doesn't?
I don't have the explain output handy right now, but it is definitely cast related.
I think the problem is that the casts are removed by the UnwrapCastInComparison rule.
However, this rule is never hit because DesugarBetween won't convert the query to >= and <= when the query contains a cast.
Adding an additional trivial switch case to DesugarBetween for when the input is a cast solves the problem. However, a correct implementation probably needs to be more careful and confirm that the types are valid for conversion to >= and <=.
As pushdown works when the query is manually converted to WHERE >= x and <= y, I think this is definitely a missing condition in the optimiser.
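To make the reasoning above concrete, here is a minimal toy model in Python. It is not Trino's Java code: the node classes, `desugar_between`, `cast_comparisons`, the column name `ts`, the type name and the literal values are all hypothetical and only illustrate the idea. A pass that matches comparisons against a cast finds nothing while the predicate is still a BETWEEN node, and finds two candidates once the BETWEEN has been desugared.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Cast:
    operand: "Expr"
    target_type: str


@dataclass(frozen=True)
class Between:
    value: "Expr"
    low: "Expr"
    high: "Expr"


@dataclass(frozen=True)
class Comparison:
    op: str  # ">=" or "<="
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Column, Literal, Cast, Between, Comparison, And]


def desugar_between(expr: Expr) -> Expr:
    """Rewrite `v BETWEEN lo AND hi` as `v >= lo AND v <= hi`.

    Toy version: applied even when `v` is a Cast. A real rule would also
    have to check that the types involved make this rewrite valid.
    """
    if isinstance(expr, Between):
        return And(
            Comparison(">=", expr.value, expr.low),
            Comparison("<=", expr.value, expr.high),
        )
    return expr


def cast_comparisons(expr: Expr) -> List[Comparison]:
    """Collect comparisons whose left side is a Cast.

    These are the nodes a cast-unwrapping pass could match.
    """
    if isinstance(expr, Comparison):
        return [expr] if isinstance(expr.left, Cast) else []
    if isinstance(expr, And):
        return cast_comparisons(expr.left) + cast_comparisons(expr.right)
    return []


# Hypothetical predicate: CAST(ts AS timestamp(6)) BETWEEN lo AND hi
predicate = Between(
    Cast(Column("ts"), "timestamp(6)"), Literal("lo"), Literal("hi")
)
print(len(cast_comparisons(predicate)))                   # 0
print(len(cast_comparisons(desugar_between(predicate))))  # 2
```

In this sketch the BETWEEN node hides the cast from anything that only looks at comparisons, which matches the behaviour described above where the manual >= and <= form is pushed down but the BETWEEN form is not.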
Summary - Queries using BETWEEN result in full table scans when the timestamp filter has a greater length than the column in the underlying table.
This example shows how to reproduce the problem using PostgreSQL; however, I believe it impacts all connectors.
Example database:
Example data:
The following queries all work successfully, and the filter logic is correctly pushed down to PostgreSQL. Only two rows are returned to Trino.
However, taking the BETWEEN equivalent of the final query:
This does a full scan of the table, which on large tables is substantially slower.
My expectation is that these final two queries should result in identical query plans.