influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

Certain combinations of filter expressions may produce unexpected query results #25565

Open hiltontj opened 4 days ago

hiltontj commented 4 days ago

For now this is only a hunch, as there is no reproducer yet. My suspicion is that, given how the last cache handles predicates, a query containing multiple predicates on a single column will not be evaluated correctly.

For example,

SELECT * FROM last_cache('foo') WHERE bar = 'baz' OR bar = 'bop'

will be evaluated using only one of the two bar = predicates, rather than both in combination.

The fix should be to combine the expressions in such a scenario, so that the query is treated as:

SELECT * FROM last_cache('foo') WHERE bar IN ('baz', 'bop')
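To make the proposed rewrite concrete, here is a minimal sketch in Python (not the actual last cache code, which lives in the Rust codebase). It assumes the predicates arrive as a flat list of OR'd equality comparisons; the names Eq, merge_or_equalities and to_sql are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eq:
    """Hypothetical stand-in for a single `column = 'value'` predicate."""
    column: str
    value: str


def merge_or_equalities(predicates):
    """Group OR'd equality predicates by column.

    Returns a dict mapping each column to the list of values it may take,
    in first-seen order and without duplicates. Two predicates on the same
    column are combined rather than one overwriting the other.
    """
    merged = {}
    for p in predicates:
        values = merged.setdefault(p.column, [])
        if p.value not in values:
            values.append(p.value)
    return merged


def to_sql(merged):
    """Render the merged predicates as IN lists, joined with OR."""
    parts = []
    for column, values in merged.items():
        quoted = ", ".join(f"'{v}'" for v in values)
        parts.append(f"{column} IN ({quoted})")
    return " OR ".join(parts)


preds = [Eq("bar", "baz"), Eq("bar", "bop")]
print(to_sql(merge_or_equalities(preds)))
```

The point of the sketch is only that the per-column state has to be a collection of values, so that a second predicate on bar extends the set instead of replacing the first one.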

Other scenarios that need to be considered are where multiple incompatible predicates are provided, for example,

SELECT * FROM last_cache('foo') WHERE bar = 'baz' AND bar = 'bop'
SELECT * FROM last_cache('foo') WHERE bar = 'baz' OR bar != 'baz'

These should either result in an error or be ignored, so that DataFusion can handle them.
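As a rough illustration of how such cases could be detected, here is a hedged Python sketch. It assumes all predicates refer to the same column, are joined by a single AND or OR, and use only = and !=. The Verdict enum and the classify function are hypothetical names, not part of InfluxDB or DataFusion.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    NEVER_MATCHES = "never matches"
    MATCHES_ALL_NON_NULL = "matches every non-null value"


def classify(connective, predicates):
    """Classify predicates on ONE column joined by `connective`.

    `predicates` is a list of (op, value) pairs with op in {"=", "!="}.
    """
    eq = {v for op, v in predicates if op == "="}
    neq = {v for op, v in predicates if op == "!="}

    if connective == "AND":
        # bar = 'baz' AND bar = 'bop', or bar = 'baz' AND bar != 'baz'
        if len(eq) > 1 or eq & neq:
            return Verdict.NEVER_MATCHES
    elif connective == "OR":
        # bar = 'baz' OR bar != 'baz', or bar != 'a' OR bar != 'b'
        # (true for any non-null value; NULL still compares as unknown)
        if eq & neq or len(neq) > 1:
            return Verdict.MATCHES_ALL_NON_NULL
    else:
        raise ValueError(f"unsupported connective: {connective}")

    return Verdict.OK


print(classify("AND", [("=", "baz"), ("=", "bop")]))
print(classify("OR", [("=", "baz"), ("!=", "baz")]))
```

A caller could then either raise an error on any verdict other than OK, or skip its own predicate handling and leave the filter to DataFusion; which of the two is preferable is the open question in this issue.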