What is the bug?
This is a bug with direct query + iceberg.
When adding a filter to a query based on a timestamp field compared to a datetime, the filter is only considering the date from the datetime.
Example:
SELECT * FROM validation.amazon_security_lake_glue_db_us_west_2.amazon_security_lake_table_us_west_2_route53_2_0 WHERE time_dt > '2024-08-09T00:00:00Z' ORDER BY time_dt ASC LIMIT 10
The above query is expected to return all results from 2024-08-09T00:00:01Z to the current time. Instead it only returns results for entries 2024-08-10T00:00:00Z to the current time.
Changing the operator from > to >= returns the expected results.
It looks like Iceberg/Spark are comparing only the dates from the datetime filter.
Explained query for the above example looks like it has converted the datetime to an epoch microsecond correctly, but the query results do not align with that.
What is the bug? This is a bug with direct query + iceberg.
When adding a filter to a query based on a
timestamp
field compared to a datetime, the filter is only considering the date from the datetime.Example:
The above query is expected to return all results from
2024-08-09T00:00:01Z
to the current time. Instead it only returns results for entries2024-08-10T00:00:00Z
to the current time.Changing the operator from
>
to>=
returns the expected results.It looks like Iceberg/Spark are comparing only the dates from the datetime filter.
Explained query for the above example looks like it has converted the datetime to an epoch microsecond correctly, but the query results do not align with that.
How can one reproduce the bug? Steps to reproduce the behavior:
What is the expected behavior? Filters based on datetimes should return results after the datetime rather than the date within the datetime.
What is your host/environment?
Do you have any screenshots? If applicable, add screenshots to help explain your problem.
Do you have any additional context? Add any other context about the problem.