trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.37k stars 2.98k forks source link

Incorrect Filtering Behavior in MongoDB Connector #18130

Open pedromwurzel opened 1 year ago

pedromwurzel commented 1 year ago

It appears that there is an issue with the MongoDB connector. When I use the following query:

SELECT * FROM mongo_connector.mongo_schema.students
WHERE (genre IS NULL OR genre IN ('Female')) AND (class IS NULL OR class IN ('A'))

the search is ignoring the filtered genres and returning elements that have 'Male' as the genre. This behavior is not observed with the MySQL connector.

MongoDB connector: image

MySQL connector: image

To set up this scenario, I used the following commands:

CREATE SCHEMA mongo_connector.mongo_schema

CREATE TABLE mongo_connector.mongo_schema.students (
  id INT,
  genre VARCHAR,
  class VARCHAR
)
INSERT INTO mongo_connector.mongo_schema.students VALUES
(1, 'Male', 'A'),
(2, 'Female', 'B'),
(3, 'Female', NULL),
(4, NULL, 'B')

To create the same scenario in the MySQL connector I just changed the name of the catalog and schema. Please note that the issue arises specifically when executing this query in the MongoDB connector, while the MySQL connector works correctly. I noticed that this error started to exist from version 402 of Trino. Up to version 401, the MongoDB connector behaves correctly, the same as the MySQL connector.

ebyhr commented 1 year ago

The root cause is io.trino.plugin.mongodb.MongoSession#buildQuery. buildPredicate method generates two $or conditions and the 1st condition genre is overwritten by class.