EnterpriseDB / mongo_fdw

PostgreSQL foreign data wrapper for MongoDB
GNU Lesser General Public License v3.0
326 stars 70 forks source link

Filtering Condition doesn't push down #58

Open ljy013 opened 8 years ago

ljy013 commented 8 years ago

I'm testing with mongo_fdw on PG9.5. And I experienced issue. It seems that Filetering condidion doesn't push down.

Firstly, I queried on mongodb shell like below. And it looks filter correctly.

db.test2.find({code:"A", ymd:"20150101"})

2016-07-22T14:58:37.528+0900 D SHARDING [conn27] command: fdw02.$cmd { find: "test2", filter: { code: "A", ymd: "20150101" } } ntoreturn: 1 options: 0 2016-07-22T14:58:37.529+0900 D EXECUTOR [conn27] Scheduling remote command request: RemoteCommand 1789626 -- target:...: db:fdw02 cmd:{ find: "test2", filter: { code: "A", acrs_ymd: "20150101" }, shardVersion: [ Timestamp 0|0, ObjectId('000000000000000000000000') ] }

But When I queried on PG through FDW, there's no filter condition.

select * from tbm_prgs_fdw_temp where code = 'A' and ymd = '20150101';

2016-07-22T14:58:52.761+0900 D SHARDING [conn29] command: fdw02.$cmd { count: "test2" } ntoreturn: 1 options: 0 2016-07-22T14:58:52.761+0900 D NETWORK [conn29] creating pcursor over QSpec { ns: "fdw02.$cmd", n2skip: 0, n2return: 1, options: 0, query: { count: "test2" }, fields: {} } and CInfo { v_ns: "fdw02.test2", filter: {} }

ahsanhadi commented 8 years ago

Can you please share the explain plan of the query you are executing using mongodb_fdw...?

lifubang commented 8 years ago

mongo_fdw need two count commands for full table and one isMaster command before you exec a select sql with filter.

chinhngt commented 7 years ago

It looks like filtering is not being pushed down for me too:

explain (analyze) select * from events where event_id = '00002950-55e7-4b16-aa82-cb86f7a6aa31';

Foreign Scan on events (cost=0.00..0.00 rows=1000 width=1508) (actual time=0.288..3628.348 rows=11 loops=1) Filter: ((event_id)::text = '00002950-55e7-4b16-aa82-cb86f7a6aa31'::text) Rows Removed by Filter: 2455023 Foreign Namespace: db.events Planning time: 1.033 ms Execution time: 3628.448 ms (6 rows)

Any idea @lifubang @ahsanhadi? I have postgresql-9.6.3 and mongodb 3.4.6

lifubang commented 7 years ago

@chinhngt I think the event_id is not ObjectId type. If you use ObjectId type column in filter, It will be improved(in PG, the type is "NAME"). The string type in where clause is very slow when use mongo_fdw. If you have no ObjectId type column, you can use numeric type column. For example, you can add a column id, the type of it is long.

ahsanhadi commented 7 years ago

Hi,

I ran a query on the sample warehouse table

"select * from warehouse where warehouse_name='UPS';"

and i can see the filter condition in the mongodb logs. It seems that explain analyze not working with mongdb_fdw but the where clause is getting pushed down and it can checked in the log...

Can you also check your mongo log...

-- Ahsan

vaibhavdalvi93 commented 2 years ago

Hi @lifubang,

select * from tbm_prgs_fdw_temp where code = 'A' and ymd = '20150101';

Currently, this condition in WHERE clause is not going to pushdown because boolean expression or nested operator expression not yet supported. We're currently working on supporting pushdown of nested operator expression and boolean expression as well in WHERE condition.

esatterwhite commented 1 year ago

What is the status of this. Was this ever addressed?

vaibhavdalvi93 commented 1 year ago

@esatterwhite, support of pushing down boolean expression or nested operator expression in WHERE has been already added in the latest version of mongo_fdw i.e. 5.5.0. Please use the latest version of mongo_fdw and check. Kindly, let us know your valuable feedback.

esatterwhite commented 1 year ago

@vaibhavdalvi93 mostly unrelated, but the fdw always wants to connect to localhost.


CREATE SERVER mongo_server                                                                        
        FOREIGN DATA WRAPPER mongo_fdw
        OPTIONS (address 'mongo', port '27017', authentication_database 'admin', replica_set 'test-mongo-repl', read_preference 'secondaryPreferred');

CREATE USER MAPPING for CURRENT_USER SERVER mongo_server OPTIONS(username 'admin', password 'password');

CREATE FOREIGN TABLE mg_accounts (_id name, account text, billing json, billingemail text, company text, key text[]) SERVER mongo_server OPTIONS(database 'db', collection 'accounts');

select * from mg_accounts;
\des+
                                                                                                          List of foreign servers
     Name     |  Owner   | Foreign-data wrapper | Access privileges | Type | Version |                                                              FDW options                                                              | Description 
--------------+----------+----------------------+-------------------+------+---------+---------------------------------------------------------------------------------------------------------------------------------------+-------------
 mongo_server | postgres | mongo_fdw            |                   |      |         | (address 'mongo', port '27017', authentication_database 'admin', replica_set 'test-mongo-repl', read_preference 'secondaryPreferred') | 

ERROR:  could not connect to server mongo_server
HINT:  Mongo error: "No suitable servers found (`serverSelectionTryOnce` set): [connection refused calling ismaster on 'localhost:27017']"
esatterwhite commented 1 year ago

Oh, interesting, If I alter the port to non-default, it tries the server, if i set it back to the default port, it goes back to the default host


logdna=# alter SERVER mongo_server OPTIONS(SET port '27018');
ALTER SERVER
db=# \des+
                                                                                                          List of foreign servers
     Name     |  Owner   | Foreign-data wrapper | Access privileges | Type | Version |                                                              FDW options                                                              | Description 
--------------+----------+----------------------+-------------------+------+---------+---------------------------------------------------------------------------------------------------------------------------------------+-------------
 mongo_server | postgres | mongo_fdw            |                   |      |         | (address 'mongo', port '27018', authentication_database 'admin', replica_set 'test-mongo-repl', read_preference 'secondaryPreferred') | 
(1 row)

db=# select * from mg_accounts;
ERROR:  could not connect to server mongo_server
HINT:  Mongo error: "No suitable servers found (`serverSelectionTryOnce` set): [connection refused calling ismaster on 'mongo:27018']"

db=# alter SERVER mongo_server OPTIONS(SET port '27017');
ALTER SERVER

db=# select * from mg_accounts;
ERROR:  could not connect to server mongo_server
HINT:  Mongo error: "No suitable servers found (`serverSelectionTryOnce` set): [connection refused calling ismaster on 'localhost:27017']"
esatterwhite commented 1 year ago

I figured out the problem, nevermind me 😆

simon-wolf commented 1 year ago

@esatterwhite, support of pushing down boolean expression or nested operator expression in WHERE has been already added in the latest version of mongo_fdw i.e. 5.5.0. Please use the latest version of mongo_fdw and check. Kindly, let us know your valuable feedback.

@vaibhavdalvi93 I think I'm still seeing a problem with booleans.

Using pgcli, I'm showing an extension version of 1.1 and a mongo_fdw_version of 50500 which I think is the latest version:

dev_v3_platform> \dx
+-----------+---------+------------+------------------------------------------------------------+
| Name      | Version | Schema     | Description                                                |
|-----------+---------+------------+------------------------------------------------------------|
| citext    | 1.6     | public     | data type for case-insensitive character strings           |
| mongo_fdw | 1.1     | public     | foreign data wrapper for MongoDB access                    |
...

dev_v3_platform> select mongo_fdw_version();
+-------------------+
| mongo_fdw_version |
|-------------------|
| 50500             |
+-------------------+

If I run the following query I will get back records where active is true:

dev_v3_platform> SELECT "customerId", "active" FROM mongo_devices_prod WHERE "active" = false LIMIT 5;
+-------------------+--------+
| customerId        | active |
|-------------------+--------|
| x8t8K4NpvqBQftMKo | True   |
| gaQ3wAmk8zaJ3dFa5 | True   |
| x8t8K4NpvqBQftMKo | True   |
| x8t8K4NpvqBQftMKo | True   |
| gaQ3wAmk8zaJ3dFa5 | True   |
+-------------------+--------+

I can get it to work if I change the WHERE clause to this:

dev_v3_platform> SELECT "customerId", "active" FROM mongo_devices_prod WHERE "active" is false LIMIT 5;
+-------------------+--------+
| customerId        | active |
|-------------------+--------|
| mariCyaD5YzAvF2rj | False  |
| pTTcPKDG4Zzrx2org | False  |
| mariCyaD5YzAvF2rj | False  |
| gaQ3wAmk8zaJ3dFa5 | False  |
| gaQ3wAmk8zaJ3dFa5 | False  |
+-------------------+--------+

I'm assuming that is not the expected situation.

vaibhavdalvi93 commented 1 year ago

Thanks, @simon-wolf for reporting an issue.

This is a bug. The boolean expression in WHERE clause is getting push-down but wrong remote query i.e MongoDB query document is getting formed and this is the reason for wrong output.

I could solve this issue. If possible, you can use attached patch to solve this issue. Please share your valuable feedback on this. v1-Fix-bool-expr-in-WHERE.txt

We will open internal issue and try to commit it in near future.

Thanks, Vaibhav

simon-wolf commented 1 year ago

Thank you @vaibhavdalvi93. I'm using the extension via CruncyBridge but I will update my support ticket with them to see if they can test it. If not then my apologies and hopefully they will update to the version containing the eventual fix once it has been released.

Thank you for your help with this and for all of you hard work on the extension.

simon-wolf commented 1 year ago

@vaibhavdalvi93 CrunchyBridge installed the patch on my instance and I can confirm that it resolved the issue:

dev_v3_platform> SELECT "customerId", "active" FROM mongo_devices_prod WHERE "active" = false LIMIT 5;
+-------------------+--------+
| customerId        | active |
|-------------------+--------|
| mariCyaD5YzAvF2rj | False  |
| pTTcPKDG4Zzrx2org | False  |
| mariCyaD5YzAvF2rj | False  |
| gaQ3wAmk8zaJ3dFa5 | False  |
| gaQ3wAmk8zaJ3dFa5 | False  |
+-------------------+--------+
vaibhavdalvi93 commented 1 year ago

Thanks, @simon-wolf for acknowledgement and glad to hear that your issue got resolved.