citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0

Query with distributed table & view is pushed down to worker nodes in Citus 9.4 #4392

Closed thanodnl closed 3 years ago

thanodnl commented 3 years ago

A combination of roles, CTEs, and potentially local execution causes a permission error when a view on a privileged table is queried in a CTE.

CREATE TABLE foo (a int, application_name text);
SELECT create_distributed_table('foo', 'a');
INSERT INTO foo VALUES (1, 'psql');

CREATE USER nils2;
GRANT USAGE ON SCHEMA public TO nils2;
GRANT ALL ON foo TO nils2;

SET ROLE nils2;

WITH pg_stat_activity AS (
         SELECT pg_stat_activity.datid,
            pg_stat_activity.application_name,
            pg_stat_activity.query
           FROM pg_catalog.pg_stat_activity
        ) SELECT * FROM foo LEFT JOIN pg_stat_activity USING (application_name);

-- gives permission error
RESET ROLE; -- my role couldn't set the setting
SET citus.enable_cte_inlining TO false;
SET ROLE nils2;

WITH pg_stat_activity AS (
           SELECT pg_stat_activity.datid,
              pg_stat_activity.application_name,
              pg_stat_activity.query
             FROM pg_catalog.pg_stat_activity
          ) SELECT * FROM foo LEFT JOIN pg_stat_activity USING (application_name);
┌──────────────────┬──────┬───────┬──────────────────────────┐
│ application_name │  a   │ datid │          query           │
├──────────────────┼──────┼───────┼──────────────────────────┤
│ psql             │ 1337 │ 13722 │ <insufficient privilege> │
└──────────────────┴──────┴───────┴──────────────────────────┘
(1 row)

Looking at the start of the execution with command logging enabled, we see the following:

WITH pg_stat_activity AS (
         SELECT pg_stat_activity.datid,
            pg_stat_activity.application_name,
            pg_stat_activity.query
           FROM pg_catalog.pg_stat_activity
        ) SELECT * FROM foo LEFT JOIN pg_stat_activity USING (application_name);
NOTICE:  executing the command locally: SELECT s.datid, d.datname, s.pid, s.usesysid, u.rolname AS usename, s.application_name, s.client_addr, s.client_hostname, s.client_port, s.backend_start, s.xact_start, s.query_start, s.state_change, s.wait_event_type, s.wait_event, s.state, s.backend_xid, s.backend_xmin, s.query, s.backend_type FROM (((SELECT intermediate_result.datid, intermediate_result.pid, intermediate_result.usesysid, intermediate_result.application_name, intermediate_result.state, intermediate_result.query, intermediate_result.wait_event_type, intermediate_result.wait_event, intermediate_result.xact_start, intermediate_result.query_start, intermediate_result.backend_start, intermediate_result.state_change, intermediate_result.client_addr, intermediate_result.client_hostname, intermediate_result.client_port, intermediate_result.backend_xid, intermediate_result.backend_xmin, intermediate_result.backend_type, intermediate_result.ssl, intermediate_result.sslversion, intermediate_result.sslcipher, intermediate_result.sslbits, intermediate_result.sslcompression, intermediate_result.ssl_client_dn, intermediate_result.ssl_client_serial, intermediate_result.ssl_issuer_dn, intermediate_result.gss_auth, intermediate_result.gss_princ, intermediate_result.gss_enc FROM read_intermediate_result('11_1'::text, 'binary'::citus_copy_format) intermediate_result(datid oid, pid integer, usesysid oid, application_name text, state text, query text, wait_event_type text, wait_event text, xact_start timestamp with time zone, query_start timestamp with time zone, backend_start timestamp with time zone, state_change timestamp with time zone, client_addr inet, client_hostname text, client_port integer, backend_xid xid, backend_xmin xid, backend_type text, ssl boolean, sslversion text, sslcipher text, sslbits integer, sslcompression boolean, ssl_client_dn text, ssl_client_serial numeric, ssl_issuer_dn text, gss_auth boolean, gss_princ text, gss_enc boolean)) s LEFT JOIN pg_database d ON ((s.datid OPERATOR(pg_catalog.=) d.oid))) LEFT JOIN pg_authid u ON ((s.usesysid OPERATOR(pg_catalog.=) u.oid)))

Here we see that the view has been unrolled. This works when running as the superuser, but not as any other user. With CTE inlining disabled, the execution doesn't show the local execution of the CTE part; I guess it is not run as a command via the executor, but is instead evaluated in place.

thanodnl commented 3 years ago

The user was actually joining against a reference table instead of a distributed table. Disabling the GUC didn't solve it for them.

To reproduce I used the following file:

$ cat psa.sql
CREATE TABLE foo (a int, application_name text);
SELECT create_reference_table('foo');
EXPLAIN WITH psa AS (
SELECT pg_stat_activity.datid,
            pg_stat_activity.application_name,
            pg_stat_activity.query
           FROM pg_catalog.pg_stat_activity
) SELECT * FROM foo JOIN psa USING (application_name);

Running against a cleanly provisioned database:

psql -p9700 -a -f psa.sql

There are two different EXPLAIN outputs you will get. On v9.3.5:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                          QUERY PLAN                                                          │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)                                                               │
│   ->  Distributed Subplan 1_1                                                                                                │
│         ->  Function Scan on pg_stat_get_activity s  (cost=0.00..1.00 rows=100 width=68)                                     │
│   Task Count: 1                                                                                                              │
│   Tasks Shown: All                                                                                                           │
│   ->  Task                                                                                                                   │
│         Node: host=localhost port=9801 dbname=postgres                                                                       │
│         ->  Merge Join  (cost=148.00..248.25 rows=6350 width=72)                                                             │
│               Merge Cond: (intermediate_result.application_name = foo.application_name)                                      │
│               ->  Sort  (cost=59.83..62.33 rows=1000 width=68)                                                               │
│                     Sort Key: intermediate_result.application_name                                                           │
│                     ->  Function Scan on read_intermediate_result intermediate_result  (cost=0.00..10.00 rows=1000 width=68) │
│               ->  Sort  (cost=88.17..91.35 rows=1270 width=36)                                                               │
│                     Sort Key: foo.application_name                                                                           │
│                     ->  Seq Scan on foo_102008 foo  (cost=0.00..22.70 rows=1270 width=36)                                    │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

On v9.4.3:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                              QUERY PLAN                                              │
├──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)                                       │
│   Task Count: 1                                                                                      │
│   Tasks Shown: All                                                                                   │
│   ->  Task                                                                                           │
│         Node: host=localhost port=9801 dbname=postgres                                               │
│         ->  Hash Join  (cost=2.25..36.06 rows=635 width=72)                                          │
│               Hash Cond: (foo.application_name = s.application_name)                                 │
│               ->  Seq Scan on foo_102008 foo  (cost=0.00..22.70 rows=1270 width=36)                  │
│               ->  Hash  (cost=1.00..1.00 rows=100 width=72)                                          │
│                     ->  Function Scan on pg_stat_get_activity s  (cost=0.00..1.00 rows=100 width=72) │
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘

Git bisect points to https://github.com/citusdata/citus/commit/24feadc23013016450a731614c15a970b777b39d, which was introduced with https://github.com/citusdata/citus/pull/3887.

marcocitus commented 3 years ago

The root of the issue is that we don't have general support for view permissions. There are apparently some ways to read from a view without having permission to read from the underlying table, namely when the view is encapsulated in a (materialized, recursively planned) CTE, only involves local tables, and is part of a query with distributed tables.
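
For context, in plain PostgreSQL the underlying relations of a view are checked against the view owner's privileges rather than the caller's, so granting SELECT on a view is enough to read it even without access to the table it is built on. A minimal sketch with illustrative names (not from the report above):

CREATE TABLE secret_data (a int);
CREATE VIEW secret_view AS SELECT a FROM secret_data;
CREATE USER limited_user;
GRANT SELECT ON secret_view TO limited_user;

SET ROLE limited_user;
SELECT * FROM secret_view;  -- works: checked against the view owner's privileges
SELECT * FROM secret_data;  -- ERROR: permission denied for table secret_data
RESET ROLE;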

PG12 inlines the CTE, which also transforms it into a subquery. After inlining, the view is unwrapped and permissions on the underlying tables are required, even if we recursively plan the query. While the latter leads to the same plan as a materialized CTE, it requires table read permission.
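
For reference, PG12 also lets the query author pin this behavior per CTE with AS [NOT] MATERIALIZED; a minimal sketch reusing the query shape from the report above, where AS MATERIALIZED restores the pre-PG12 (non-inlined) behavior:

WITH pg_stat_activity AS MATERIALIZED (
         SELECT pg_stat_activity.datid,
            pg_stat_activity.application_name,
            pg_stat_activity.query
           FROM pg_catalog.pg_stat_activity
        ) SELECT * FROM foo LEFT JOIN pg_stat_activity USING (application_name);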

In <=9.3, CTE inlining tripped over the following error:

DEBUG:  Planning after CTEs inlined failed with                                                                                                                                                
message: relation pg_database is not distributed                                                                                                                                               
detail:                                                                                                                                                                                        
hint:      

Due to that error, the inlining was effectively undone pre-9.4, returning to the previous materialized CTE state, in which the query was supported.

onurctirtir commented 3 years ago

For that CTE query, postgres doesn't give an error; instead, it behaves like this:

WITH pg_stat_activity AS (
           SELECT pg_stat_activity.datid,
              pg_stat_activity.application_name,
              pg_stat_activity.query
             FROM pg_catalog.pg_stat_activity
          ) SELECT * FROM foo LEFT JOIN pg_stat_activity USING (application_name);
┌──────────────────┬──────┬───────┬──────────────────────────┐
│ application_name │  a   │ datid │          query           │
├──────────────────┼──────┼───────┼──────────────────────────┤
│ psql             │ 1337 │ 13722 │ <insufficient privilege> │
└──────────────────┴──────┴───────┴──────────────────────────┘
(1 row)

~~So giving a permission denied error is the wrong behavior.~~ edit: Giving a permission denied error is the expected behavior for Citus.


I can reproduce this error, which is the expected behavior, on master, v9.5.0, and v9.4.0, but not on v9.3.0.

I will use git-bisect to see which commit fixed this behavior

onderkalaci commented 3 years ago

After inlining, the view is unwrapped and permissions are required, even if we recursively plan the query.

This is the unclear part to me. CTE inlining just copies the CTE's subquery into the FROM clause as a regular subquery, same for both PG and Citus. Why is the view unwrapped after inlining? I guess I'm missing something here; I need to debug a little further.

thanodnl commented 3 years ago

@onurctirtir

I will use git-bisect to see which commit fixed this behavior

I ran git bisect, linked the commit above: https://github.com/citusdata/citus/commit/24feadc23013016450a731614c15a970b777b39d

thanodnl commented 3 years ago

Also, in production we solved it by adding random() to the target list of the CTE. The user confirmed that this worked for them.
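
Roughly, the workaround looks like the sketch below (reusing the query from the original report; the extra column name is illustrative). The volatile random() call keeps PG12 from inlining the CTE, so it stays materialized:

WITH pg_stat_activity AS (
         SELECT pg_stat_activity.datid,
            pg_stat_activity.application_name,
            pg_stat_activity.query,
            random() AS force_materialization  -- volatile expression prevents CTE inlining
           FROM pg_catalog.pg_stat_activity
        ) SELECT * FROM foo LEFT JOIN pg_stat_activity USING (application_name);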

onurctirtir commented 3 years ago

@onurctirtir

I will use git-bisect to see which commit fixed this behavior

I ran git bisect, linked the commit above: 24feadc

Yeah, I've seen that, but in that commit and in the one before it we still throw the permission error, so I couldn't be sure whether it's the fixing commit.

thanodnl commented 3 years ago

we still throw the permission error

I think the title of the issue might need revision given the newer insights; that is not to say that having permissions on views but not on the underlying tables couldn't still be an issue.

The customer's root cause does not directly have to do with permissions, but rather with the fact that the query against the pg_stat_activity view got pushed down to the worker. In the process of pushing down the query on the view, the view was unrolled (which I think is expected), triggering the permission error on pg_authid.

From the distributed planner's perspective it is a coincidence that a pg_authid table is there, and the error comes from the fact that the user can't query it. To be fair, this is what made us catch the fault, but it is a red herring for the underlying problem of pushing down the query on non-distributed tables.
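
For completeness, pg_authid is only readable by superusers (it holds password hashes), which is why the unrolled pg_stat_activity definition fails for a regular role; a quick check, assuming the nils2 role from the original repro:

SET ROLE nils2;
SELECT rolname FROM pg_authid;  -- ERROR: permission denied for table pg_authid
RESET ROLE;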

marcocitus commented 3 years ago

More with the fact that the query against the pg_stat_activity view got pushed down to the worker.

Note that this only happens on 9.4, not on 9.5 / master.

Master correctly does recursive planning:

WITH pg_stat_activity AS (                                                                                                                                                          
         SELECT pg_stat_activity.datid,                                                                                                                                                        
            pg_stat_activity.application_name,                                                                                                                                                 
            pg_stat_activity.query                                                                                                                                                             
           FROM pg_catalog.pg_stat_activity                                                                                                                                                    
        ) SELECT * FROM ref r LEFT JOIN pg_stat_activity ON (r.a = pg_stat_activity.datid);                                                                                                    
NOTICE:  executing the command locally: SELECT s.datid, d.datname, s.pid, s.usesysid, u.rolname AS usename, s.application_name, s.client_addr, s.client_hostname, s.client_port, s.backend_star
t, s.xact_start, s.query_start, s.state_change, s.wait_event_type, s.wait_event, s.state, s.backend_xid, s.backend_xmin, s.query, s.backend_type FROM (((SELECT intermediate_result.datid, inte
rmediate_result.pid, intermediate_result.usesysid, intermediate_result.application_name, intermediate_result.state, intermediate_result.query, intermediate_result.wait_event_type, intermediat
e_result.wait_event, intermediate_result.xact_start, intermediate_result.query_start, intermediate_result.backend_start, intermediate_result.state_change, intermediate_result.client_addr, int
ermediate_result.client_hostname, intermediate_result.client_port, intermediate_result.backend_xid, intermediate_result.backend_xmin, intermediate_result.backend_type, intermediate_result.ssl
, intermediate_result.sslversion, intermediate_result.sslcipher, intermediate_result.sslbits, intermediate_result.sslcompression, intermediate_result.ssl_client_dn, intermediate_result.ssl_cl
ient_serial, intermediate_result.ssl_issuer_dn, intermediate_result.gss_auth, intermediate_result.gss_princ, intermediate_result.gss_enc FROM read_intermediate_result('2_1'::text, 'binary'::c
itus_copy_format) intermediate_result(datid oid, pid integer, usesysid oid, application_name text, state text, query text, wait_event_type text, wait_event text, xact_start timestamp with tim
e zone, query_start timestamp with time zone, backend_start timestamp with time zone, state_change timestamp with time zone, client_addr inet, client_hostname text, client_port integer, backe
nd_xid xid, backend_xmin xid, backend_type text, ssl boolean, sslversion text, sslcipher text, sslbits integer, sslcompression boolean, ssl_client_dn text, ssl_client_serial numeric, ssl_issu
er_dn text, gss_auth boolean, gss_princ text, gss_enc boolean)) s LEFT JOIN pg_database d ON ((s.datid OPERATOR(pg_catalog.=) d.oid))) LEFT JOIN pg_authid u ON ((s.usesysid OPERATOR(pg_catalo
g.=) u.oid)))                                                                                                                                                                                  
NOTICE:  issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(0, 13, '2020-12-06 14:59:28.032876+01');
DETAIL:  on server marco@localhost:1201 connectionId: 1
NOTICE:  issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(0, 13, '2020-12-06 14:59:28.032876+01');
DETAIL:  on server marco@localhost:1202 connectionId: 2
NOTICE:  issuing COPY "2_2" FROM STDIN WITH (format result)
DETAIL:  on server marco@localhost:1201 connectionId: 1
NOTICE:  issuing COPY "2_2" FROM STDIN WITH (format result)
DETAIL:  on server marco@localhost:1202 connectionId: 2
NOTICE:  issuing SELECT r.a, r.b, pg_stat_activity.datid, pg_stat_activity.application_name, pg_stat_activity.query FROM (public.ref_102040 r LEFT JOIN (SELECT pg_stat_activity_1.datid, pg_st
at_activity_1.application_name, pg_stat_activity_1.query FROM (SELECT intermediate_result.datid, intermediate_result.datname, intermediate_result.pid, intermediate_result.usesysid, intermedia
te_result.usename, intermediate_result.application_name, intermediate_result.client_addr, intermediate_result.client_hostname, intermediate_result.client_port, intermediate_result.backend_sta
rt, intermediate_result.xact_start, intermediate_result.query_start, intermediate_result.state_change, intermediate_result.wait_event_type, intermediate_result.wait_event, intermediate_result
.state, intermediate_result.backend_xid, intermediate_result.backend_xmin, intermediate_result.query, intermediate_result.backend_type FROM read_intermediate_result('2_2'::text, 'binary'::cit
us_copy_format) intermediate_result(datid oid, datname name, pid integer, usesysid oid, usename name, application_name text, client_addr inet, client_hostname text, client_port integer, backe
nd_start timestamp with time zone, xact_start timestamp with time zone, query_start timestamp with time zone, state_change timestamp with time zone, wait_event_type text, wait_event text, sta
te text, backend_xid xid, backend_xmin xid, query text, backend_type text)) pg_stat_activity_1) pg_stat_activity ON (((r.a)::oid OPERATOR(pg_catalog.=) pg_stat_activity.datid)))
DETAIL:  on server marco@localhost:1201 connectionId: 1
NOTICE:  issuing COMMIT
DETAIL:  on server marco@localhost:1201 connectionId: 1
NOTICE:  issuing COMMIT
DETAIL:  on server marco@localhost:1202 connectionId: 2

9.4 incorrectly pushes down:

WITH pg_stat_activity AS (                                                                                                                                                          
         SELECT pg_stat_activity.datid,                                                                                                                                                        
            pg_stat_activity.application_name,                                                                                                                                                 
            pg_stat_activity.query                                                                                                                                                             
           FROM pg_catalog.pg_stat_activity                                                                                                                                                    
        ) SELECT * FROM ref r LEFT JOIN pg_stat_activity ON (r.a = pg_stat_activity.datid);    
NOTICE:  issuing SELECT r.a, r.b, pg_stat_activity.datid, pg_stat_activity.application_name, pg_stat_activity.query FROM (public.ref_102040 r LEFT JOIN (SELECT pg_stat_activity_1.datid, pg_stat_activity_1.application_name, pg_stat_activity_1.query FROM (SELECT s.datid, d.datname, s.pid, s.usesysid, u.rolname AS usename, s.application_name, s.client_addr, s.client_hostname, s.client_port, s.backend_start, s.xact_start, s.query_start, s.state_change, s.wait_event_type, s.wait_event, s.state, s.backend_xid, s.backend_xmin, s.query, s.backend_type FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc) LEFT JOIN pg_database d ON ((s.datid OPERATOR(pg_catalog.=) d.oid))) LEFT JOIN pg_authid u ON ((s.usesysid OPERATOR(pg_catalog.=) u.oid)))) pg_stat_activity_1) pg_stat_activity ON (((r.a)::oid OPERATOR(pg_catalog.=) pg_stat_activity.datid)))
DETAIL:  on server marco@localhost:1201 connectionId: 1

onurctirtir commented 3 years ago

I verified that it is not fixed by the citus local tables planner changes.

Then it is certain that the fixing commit is between 5e5ba467932754b75a0faf241b72977bab5adee0 and d3019f1b6d153177cc9d64fa4b8190a655e7bc57, where d3019f1b6d153177cc9d64fa4b8190a655e7bc57 is the output of git merge-base v9.5.0 master.

onurctirtir commented 3 years ago

Doing git-bisect between the two commits I mentioned in the comment above, I finally found the fixing commit, which is e0d2ac76208aef6ae0be3bb38adc23016f889b3d.

onurctirtir commented 3 years ago

Before e0d2ac76208aef6ae0be3bb38adc23016f889b3d, say at 05569526077a45d170a96500446c743c87d4cea2, we were relying on multi_relation_restriction_hook to find local relations. However, multi_relation_restriction_hook was not even executed for pg_stat_activity (not sure why), so RelationRestrictionContext->hasLocalRelation stayed false and we were doing push-down execution.

However, one commit later (e0d2ac76208aef6ae0be3bb38adc23016f889b3d, which is the fixing commit), since we no longer rely on multi_relation_restriction_hook, we were able to catch the pg_stat_activity view as a local relation and so did not do push-down execution.

See https://github.com/citusdata/citus/commit/e0d2ac76208aef6ae0be3bb38adc23016f889b3d#diff-3a10926b5b6507fef6f9e9fe86a6a95220a5775ae4529358ebd49a879c4a0383R2268-R2271
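
As a user-facing sanity check (a sketch, not how the planner decides internally): relations Citus treats as distributed or reference tables are listed in the pg_dist_partition catalog, and anything not listed there, like the pg_stat_activity view, is a plain postgres relation that has to be recursively planned rather than pushed down:

SELECT logicalrelid FROM pg_dist_partition
WHERE logicalrelid IN ('foo'::regclass, 'pg_catalog.pg_stat_activity'::regclass);
-- only foo is returned; pg_stat_activity is not known to Citus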


As a result, in 05569526077a45d170a96500446c743c87d4cea2, we see:

EXPLAIN WITH psa AS (
SELECT pg_stat_activity.datid,
            pg_stat_activity.application_name,
            pg_stat_activity.query
           FROM pg_catalog.pg_stat_activity
) SELECT * FROM foo JOIN psa USING (application_name);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                              QUERY PLAN                                              │
├──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)                                       │
│   Task Count: 1                                                                                      │
│   Tasks Shown: All                                                                                   │
│   ->  Task                                                                                           │
│         Node: host=localhost port=9701 dbname=postgres                                               │
│         ->  Hash Join  (cost=2.25..36.06 rows=635 width=72)                                          │
│               Hash Cond: (foo.application_name = s.application_name)                                 │
│               ->  Seq Scan on foo_102008 foo  (cost=0.00..22.70 rows=1270 width=36)                  │
│               ->  Hash  (cost=1.00..1.00 rows=100 width=72)                                          │
│                     ->  Function Scan on pg_stat_get_activity s  (cost=0.00..1.00 rows=100 width=72) │
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)

Time: 2307.166 ms (00:02.307)

and in e0d2ac76208aef6ae0be3bb38adc23016f889b3d, we see:

EXPLAIN WITH psa AS (
SELECT pg_stat_activity.datid,
            pg_stat_activity.application_name,
            pg_stat_activity.query
           FROM pg_catalog.pg_stat_activity
) SELECT * FROM foo JOIN psa USING (application_name);
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                QUERY PLAN                                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)                                                                            │
│   ->  Distributed Subplan 2_1                                                                                                             │
│         ->  Function Scan on pg_stat_get_activity s  (cost=0.00..1.00 rows=100 width=512)                                                 │
│   ->  Distributed Subplan 2_2                                                                                                             │
│         ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)                                                                │
│               Task Count: 1                                                                                                               │
│               Tasks Shown: All                                                                                                            │
│               ->  Task                                                                                                                    │
│                     Node: host=localhost port=9700 dbname=postgres                                                                        │
│                     ->  Hash Left Join  (cost=2.25..17.61 rows=1000 width=440)                                                            │
│                           Hash Cond: (intermediate_result.usesysid = u.oid)                                                               │
│                           ->  Hash Left Join  (cost=1.05..13.73 rows=1000 width=376)                                                      │
│                                 Hash Cond: (intermediate_result.datid = d.oid)                                                            │
│                                 ->  Function Scan on read_intermediate_result intermediate_result  (cost=0.00..10.00 rows=1000 width=312) │
│                                 ->  Hash  (cost=1.02..1.02 rows=2 width=68)                                                               │
│                                       ->  Seq Scan on pg_database d  (cost=0.00..1.02 rows=2 width=68)                                    │
│                           ->  Hash  (cost=1.09..1.09 rows=9 width=68)                                                                     │
│                                 ->  Seq Scan on pg_authid u  (cost=0.00..1.09 rows=9 width=68)                                            │
│   Task Count: 1                                                                                                                           │
│   Tasks Shown: All                                                                                                                        │
│   ->  Task                                                                                                                                │
│         Node: host=localhost port=9701 dbname=postgres                                                                                    │
│         ->  Merge Join  (cost=148.00..248.25 rows=6350 width=72)                                                                          │
│               Merge Cond: (intermediate_result.application_name = foo.application_name)                                                   │
│               ->  Sort  (cost=59.83..62.33 rows=1000 width=68)                                                                            │
│                     Sort Key: intermediate_result.application_name                                                                        │
│                     ->  Function Scan on read_intermediate_result intermediate_result  (cost=0.00..10.00 rows=1000 width=68)              │
│               ->  Sort  (cost=88.17..91.35 rows=1270 width=36)                                                                            │
│                     Sort Key: foo.application_name                                                                                        │
│                     ->  Seq Scan on foo_102008 foo  (cost=0.00..22.70 rows=1270 width=36)                                                 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(30 rows)

Time: 22.704 ms

Now I'm trying to find some more test cases for e0d2ac76208aef6ae0be3bb38adc23016f889b3d (https://github.com/citusdata/citus/pull/4292) other than the one above.

For example, I tried:

CREATE TABLE local_table_1 (a int, b int);
CREATE TABLE local_table_2 (a int, b int);

CREATE VIEW local_tables_view AS
SELECT local_table_1.a AS a, local_table_1.b AS b
FROM local_table_1
LEFT JOIN local_table_2 AS l2_1 USING (a)
LEFT JOIN local_table_2 AS l2_2 USING (a);

EXPLAIN WITH cte AS (
  SELECT local_tables_view.a, local_tables_view.b
  FROM local_tables_view
) SELECT * FROM reference_table JOIN cte USING (a);

However, it was surprisingly not pushed down to the worker nodes even before e0d2ac76208aef6ae0be3bb38adc23016f889b3d, as multi_relation_restriction_hook is called by postgres for our view in that case and we capture it as a local relation.

Now I'm trying to understand what the difference is between pg_stat_activity and local_tables_view from postgres' perspective when deciding whether to call multi_relation_restriction_hook for a view.
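
One place to start comparing the two from SQL is their definitions (a sketch; what stands out is that pg_stat_activity is defined over the pg_stat_get_activity() function plus catalog joins, while local_tables_view is defined over plain tables):

SELECT pg_get_viewdef('pg_catalog.pg_stat_activity'::regclass, true);
SELECT pg_get_viewdef('local_tables_view'::regclass, true);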

(cc: @thanodnl, @onderkalaci, @marcocitus)

onurctirtir commented 3 years ago

It is obvious that #4292 fixes the case below:

EXPLAIN WITH psa AS (
SELECT pg_stat_activity.datid,
            pg_stat_activity.application_name,
            pg_stat_activity.query
           FROM pg_catalog.pg_stat_activity
) SELECT * FROM foo JOIN psa USING (application_name);

But we might still try to push down some queries with views.

When extracting RTEs from views to find local relations in GetRTEListPropertiesForQuery, we assume that we will find some local tables in the view RTE. However, some views may not include any tables at all.

See https://github.com/citusdata/citus/issues/4405


In the two queries below, we still try to push the execution down to the workers on master:

Example - 1

EXPLAIN WITH cte AS (
  SELECT locktype AS text_col
  FROM pg_locks
) SELECT * FROM reference_table JOIN cte USING (text_col);
.....
NOTICE:  issuing EXPLAIN (ANALYZE FALSE, VERBOSE FALSE, COSTS TRUE, BUFFERS FALSE, TIMING FALSE, SUMMARY FALSE, FORMAT TEXT) WITH cte AS (SELECT pg_locks.locktype AS text_col FROM (SELECT l.locktype, l.database, l.relation, l.page, l.tuple, l.virtualxid, l.transactionid, l.classid, l.objid, l.objsubid, l.virtualtransaction, l.pid, l.mode, l.granted, l.fastpath FROM pg_lock_status() l(locktype, database, relation, page, tuple, virtualxid, transactionid, classid, objid, objsubid, virtualtransaction, pid, mode, granted, fastpath)) pg_locks) SELECT reference_table.text_col, reference_table.int_col FROM (public.reference_table_102008 reference_table(text_col, int_col) JOIN cte USING (text_col))
.....
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                    QUERY PLAN                                                     │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)                                                    │
│   Task Count: 1                                                                                                   │
│   Tasks Shown: All                                                                                                │
│   ->  Task                                                                                                        │
│         Node: host=localhost port=9701 dbname=postgres                                                            │
│         ->  Merge Join  (cost=168.00..268.25 rows=6350 width=36)                                                  │
│               Merge Cond: (cte.text_col = reference_table.text_col)                                               │
│               CTE cte                                                                                             │
│                 ->  Function Scan on pg_lock_status l  (cost=0.00..10.00 rows=1000 width=32)                      │
│               ->  Sort  (cost=69.83..72.33 rows=1000 width=32)                                                    │
│                     Sort Key: cte.text_col                                                                        │
│                     ->  CTE Scan on cte  (cost=0.00..20.00 rows=1000 width=32)                                    │
│               ->  Sort  (cost=88.17..91.35 rows=1270 width=36)                                                    │
│                     Sort Key: reference_table.text_col                                                            │
│                     ->  Seq Scan on reference_table_102008 reference_table  (cost=0.00..22.70 rows=1270 width=36) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(15 rows)

Time: 1740.177 ms (00:01.740)

This is because pg_locks doesn't refer to any tables, but to the pg_lock_status function:

View definition:
 SELECT l.locktype,
    l.database,
    .....
    l.granted,
    l.fastpath
   FROM pg_lock_status() l(locktype, database, relation, page, tuple, virtualxid, transactionid, classid, objid, objsubid, virtualtransaction, pid, mode, granted, fastpath);

Example - 2

CREATE FUNCTION add(integer, integer) RETURNS integer
    AS 'SELECT $1 + $2;'
    LANGUAGE SQL
    IMMUTABLE
    RETURNS NULL ON NULL INPUT;

CREATE OR REPLACE VIEW func_call_view AS SELECT add(1,2)::text AS text_col;

EXPLAIN WITH cte AS (
  SELECT text_col
  FROM func_call_view
) SELECT * FROM reference_table JOIN cte USING (text_col);
┌────────────────────────────────────────────────────────────────┐
│                           QUERY PLAN                           │
├────────────────────────────────────────────────────────────────┤
│ Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0) │
│   Task Count: 1                                                │
│   Tasks Shown: All                                             │
│   ->  Task                                                     │
│         Error: Could not get remote plan.                      │
└────────────────────────────────────────────────────────────────┘
(5 rows)

Time: 9.549 ms

onurctirtir commented 3 years ago

We should backport https://github.com/citusdata/citus/pull/4292 to 9.4, but it wouldn't be trivial, as that PR depends on GetRTEListProperties, and GetRTEListProperties depends on https://github.com/citusdata/citus/pull/4132, which is a fairly major change.

onurctirtir commented 3 years ago

Fixed by https://github.com/citusdata/citus/pull/4418