Open software-opal opened 2 years ago
The aggregation is not necessary for reproducing the issue. We can reproduce by:
SELECT 1 FROM cassandra.prod.bug_reproduction_table WHERE id = '1' AND trino_filter_col = 0
Hey, thanks for the PR to fix the issue highlighted in the report; It seems that there is still an issue with the Cassandra driver related to this.
Specifically, when adding the following row to the Cassandra table, the issue re-appears.
INSERT INTO prod.bug_reproduction_table(id, trino_filter_col) VALUES ('1', 0);
Outputting:
$ trino -e "SELECT COUNT(*) FROM cassandra.prod.bug_reproduction_table WHERE id = '1'"
1
$ trino -e "SELECT COUNT(*) FROM cassandra.prod.bug_reproduction_table WHERE id = '1' AND trino_filter_col = 0"
2
Happy to file a new ticket if that'd make tracking the issue a bit easier.
Sorry for not discovering this earlier, I'm not set up to build Trino from source and so waited until the 372 release to test the fix with our systems. I'll endeavour to build Trino from source and verify future PRs
We noticed an issue with the filtering of Scylla backed queries during an upgrade from version 353 to 354 where Trino would appear to not perform filtering leading to unusual aggregation results. I have spent some time investigating this issue and have been able to create a minimal set of steps and data required to reproduce the issue. This issue is still present in version 370(for which the reproduction steps have been written).
I have determined that the Cassandra driver is incorrectly filtering the results(or incorrectly pushing down filtering into Cassandra) when a query filters on both a key and another column, and only when the returned columns are used as part of an aggregation.
These are the two queries which when run against the minimal reproduction setup illustrate the issue:
Looking at the Trino logs with the Cassandra plugin's log level set to debug, we can see that:
The first query correctly pushes the
id
down into Cassandra(as seen by the debug logs for the Cassandra connector) by running this query in Cassandra:However the second query does not push the
id
into Cassandra and instead fetches all rows from Cassandra, but it does not apply theid
filter within Trino to compensate:Detailed reproduction steps
In a Cassandra CQLSH run the following commands to seed data: ```sql CREATE KEYSPACE prod WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : '1'}; CREATE TABLE prod.bug_reproduction_table ( id varchar, trino_filter_col int, PRIMARY KEY (id) ); INSERT INTO prod.bug_reproduction_table(id, trino_filter_col) VALUES ('2', 0); ``` Note that the `id` is `'2'`, which does not match the filter above and the `trino_filter_col` is `0` which does match the filter above. Now execute the queries shown above replacing `cassandra` with the name of your connector. For my testing I used a docker compose file: ```yaml version: '3' services: trino: image: trinodb/trino:370 ports: - "127.0.0.1:8080:8080" volumes: - ./catalog/:/etc/trino/catalog/:ro - ./log.properties:/etc/trino/log.properties:ro depends_on: - cassandra cassandra: image: cassandra:4 ``` And a `cassandra` catalog: ```properties connector.name=cassandra cassandra.contact-points=cassandra ```I should note that as part of investigating this issue, we initially thought it required the use of Scylla, a materialised view, the inclusion of a 'range' column(like a date-time range) however these are not necessary and the above example is about as small as I can get it.
As noted by ebyhr, the aggregation is not necessary. It is possible to reproduce the issue by replacing
SELECT COUNT(*)
withSELECT 1
orSELECT trino_filter_col
. However replacingSELECT COUNT(*)
withSELECT *
orSELECT id
causes the issue to disappear.