For now this is only a hunch, as there is no reproducer yet. My suspicion is that, given how the last cache handles predicates, a query containing multiple predicates on a single column will not be handled correctly.
For example,
SELECT * FROM last_cache('foo') WHERE bar = 'baz' OR bar = 'bop'
will be evaluated using only one of the bar = predicates, instead of both in combination.
The fix should be to combine expressions in such a scenario to become:
SELECT * FROM last_cache('foo') WHERE bar IN ('baz', 'bop')
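To make the proposed rewrite concrete, here is a minimal sketch in Python. It is a toy model, not the last cache or DataFusion code: the `(column, value)` tuple representation and the names `combine_or_equalities` and `render_in` are assumptions made up for illustration.

```python
# Toy model (hypothetical): an OR chain of equality predicates is a list of
# (column, value) tuples, e.g. bar = 'baz' OR bar = 'bop'.

def combine_or_equalities(predicates):
    """Merge OR'd equalities on one column into (column, [values]).

    Returns None when the chain is empty or spans more than one column,
    since such a chain cannot be expressed as a single IN list.
    """
    columns = {column for column, _ in predicates}
    if len(columns) != 1:
        return None
    column = columns.pop()
    values = []
    for _, value in predicates:
        if value not in values:  # drop duplicates, keep first-seen order
            values.append(value)
    return column, values


def render_in(column, values):
    """Render the merged predicate as SQL text (string values only)."""
    quoted = ", ".join(f"'{v}'" for v in values)
    return f"{column} IN ({quoted})"
```

With the example above, `combine_or_equalities([("bar", "baz"), ("bar", "bop")])` yields `("bar", ["baz", "bop"])`, which `render_in` turns into `bar IN ('baz', 'bop')`. A chain that mixes columns returns `None`, which in this sketch stands for "leave it to DataFusion".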
Other scenarios that need to be considered are where multiple incompatible predicates are provided, for example,
SELECT * FROM last_cache('foo') WHERE bar = 'baz' AND bar = 'bop'
SELECT * FROM last_cache('foo') WHERE bar = 'baz' OR bar != 'baz'
These should either be an error, or be ignored by the last cache so that DataFusion can handle them.
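One way to detect these cases before deciding between "error" and "ignore" could look like the sketch below. Again this is a toy Python model under stated assumptions: predicates on a single column are `(op, value)` tuples with `op` being `"="` or `"!="`, and `classify` is a hypothetical helper, not an existing function. The "always" result ignores NULLs; under SQL three-valued logic a NULL `bar` would not match `bar = 'baz' OR bar != 'baz'`.

```python
# Toy model (hypothetical): all predicates refer to the same column and are
# joined by a single connective, "AND" or "OR".

def classify(connective, predicates):
    """Return "never", "always", or "unknown" for a same-column predicate set.

    "never"   : no row can match (contradiction)
    "always"  : every non-NULL value matches (tautology, NULLs ignored)
    "unknown" : no shortcut detected, evaluate normally
    """
    equal_to = {value for op, value in predicates if op == "="}
    not_equal_to = {value for op, value in predicates if op == "!="}

    if connective == "AND":
        # bar = 'baz' AND bar = 'bop', or bar = 'baz' AND bar != 'baz'
        if len(equal_to) > 1 or equal_to & not_equal_to:
            return "never"
    elif connective == "OR":
        # bar = 'baz' OR bar != 'baz'
        if equal_to & not_equal_to:
            return "always"
    return "unknown"
```

For the two queries above, `classify("AND", [("=", "baz"), ("=", "bop")])` gives `"never"` and `classify("OR", [("=", "baz"), ("!=", "baz")])` gives `"always"`. Either result could then be turned into an error or into a signal to skip the cache-level predicate and let DataFusion filter.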