citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.62k stars 670 forks source link

RLS Policies don't work if built on non-partitioned columns #7182

Open Xavyr opened 1 year ago

Xavyr commented 1 year ago

Issue All of our current RLS policies are based on checking against a partitioned column, however we now want to limit results via an RLS policy based around a non partitioned column. The RLS and queries work as expected in a standard postgres setup as well as within citus across distributed tables should the shard key be the data piece the RLS is built around. But as we try to write the RLS policy around a timestamp column/value that cannot be a shard key or partitioned column the RLS policy is not respected and all results are returned. Is this a limitation of citus?

Steps to Reproduce: Create within a citus environment with postgres a table with two columns. Make one column a partitioned column and fill with dummy data. Apply RLS policy and see the results are returned correctly. Write a new RLS policy based around values found in the non partitioned columns.

Results returned do not respect the limitations of the RLS policy. There is no error or anything that signals anything is amiss.

emelsimsek commented 1 year ago

I could not reproduce this. See my steps below. Policy p1 is defined on a non-distribution column of a distributed table. I can see that the policy is applied.

Is this similar to your scenario? Can you add your steps?

CREATE TABLE message(
   id int PRIMARY KEY, 
   timestamp timestamp with time zone NOT NULL DEFAULT now(), 
   from_user text NOT NULL, 
   to_user text NOT NULL, 
   message text NOT NULL);

ALTER TABLE message ENABLE ROW LEVEL SECURITY;

SELECT create_distributed_table('message', 'id');

CREATE POLICY p1 ON message USING 
(current_user IN (from_user, to_user)) WITH CHECK (current_user = from_user);

INSERT INTO message(id, from_user, to_user, message) VALUES (2, 'user1', 'admin', 'hello');                                                                                                              
INSERT INTO message(id, from_user, to_user, message) VALUES (3, 'user2', 'admin', 'hello');

SELECT * FROM message;                                             │
 id |           timestamp           | from_user | to_user | message       │
----+-------------------------------+-----------+---------+---------      │
  3 | 2023-09-06 15:10:05.753858+03 | user2     | admin   | hello         │
  2 | 2023-09-06 15:10:05.734524+03 | user1     | admin   | hello         │
(2 rows)

CREATE USER user1;
GRANT ALL PRIVILEGES on message TO user1;

SET ROLE user1;

SELECT * FROM message;                                             │
 id |           timestamp           | from_user | to_user | message       │
----+-------------------------------+-----------+---------+---------      │
  2 | 2023-09-06 15:10:05.734524+03 | user1     | admin   | hello         │
(1 row)
Xavyr commented 1 year ago

Hi @emelsimsek, thank you very kindly for responding! I hope your day is going well.

I'll be more specific with my RLS policy and code as I'm trying to assimilate what you have done and am still getting too many rows...

In an earlier migration we have select create_distributed_table('message', 'shardKeyColumn', colocate_with => 'otherTableName')

(I wonder if the colocation is at play here).

My most recent migration that proves troublesome includes all of this code:

ALTER TABLE "message" ENABLE ROW LEVEL SECURITY;

-- without this we have a permission denied error in hitting the table and we can't see it
GRANT ALL PRIVILEGES ON "message" to limited_user

DROP POLICY IF EXISTS grant_limited_user ON  "message";

-- create policy just on message opening the floodgates
        CREATE POLICY grant_limited_user
        ON "message"
        AS PERMISSIVE
        FOR ALL
        TO  limited_user
        USING (true)
        WITH CHECK (true)

 DROP POLICY IF EXISTS restrict_message  ON "message";

-- actual policy that is giving us trouble
CREATE POLICY restrict_message
ON message
AS RESTRICTIVE
FOR ALL
TO limited_user
USING ('2020-10-24 08:30:40' IN (timestamp)) -- where this value is actually in a cell in that "timestamp" column in "message"

I've tried a variety of RLS against other columns, checking text, integer and NOT NULL... I've also tried WITH CHECK and without the WITH CHECK as part of it.