Open erichwang opened 2 years ago
CAST(timestamp_with_tz AS DATE) > DATE '2022-01-01'
per https://github.com/trinodb/trino/issues/12362#issuecomment-1138319041, unwrapping this is incorrect
CAST(timestamp_with_tz AT TIMEZONE 'UTC' AS DATE) > DATE '2022-01-01'
this we can unwrap.
CAST(timestamp_with_tz AS DATE) > DATE '2022-01-01'
per #12362 (comment), unwrapping this is incorrect
It's correct only when we know zone of all the values. For example, in Iceberg we know all timestamp with time zone values are in UTC, so we can unwrap within connector: https://github.com/trinodb/trino/pull/12918
CAST(timestamp_with_tz AT TIMEZONE 'UTC' AS DATE) > DATE '2022-01-01'
this we can unwrap.
thanks for editing the issue description to keep this only. I updated the issue title as well.
Given a SQL predicate that looks like:
we should unwrap it to the form:
-- and similarly for an upperbound. This will allow timestamps with timezones to be passed easily through the SPI in the TupleDomain format for easier connector consumption.
The reason we need to add
AT TIME ZONE <some_tz>
is to ensure that we are comparing against the same DATE literal instant. Otherwise the comparison can't be unwrapped if the column has a bunch of values with different time zones (since the DATEs for them would all refer to different timestamp instances).Why? Timestamp with TZ is a very common storage format for computer event time, particularly those formats using fixed hour offsets (e.g. '-07:00'), and it is very common to want to push this predicate down into connectors for data pruning.