Open steve-chavez opened 1 year ago
Hello Steve, what about an index on the uuid column?
This is certainly not because of LEAKPROOF or RLS. You can see it easily, by manually executing the constructed query:
SELECT COUNT(*) FROM chat_messages WHERE workspace = (app_metadata() ->> 'workspace')::uuid;
This is equally slow - even without RLS.
LEAKPROOF is about pushing conditions through RLS, i.e. when you do a query SELECT ... FROM rls_enabled_table WHERE some_func_used_here(col). In this case some_func_used_here must be defined LEAKPROOF to make it run on the base query directly - and not on the result after applying RLS.
But in this case, your function is inside the RLS policy. If you're leaking data there, that's all your own responsibility.
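To illustrate the distinction drawn above, here is a minimal sketch of the case where LEAKPROOF does matter: a function used in the WHERE clause of a query against an RLS-enabled table. The function is_short and the text column body are hypothetical names, not part of the original schema, and marking a function LEAKPROOF requires superuser privileges.

```sql
-- Hypothetical helper, declared LEAKPROOF (superuser only).
-- plpgsql is used so the planner treats the function as a single opaque call.
create function is_short(txt text) returns boolean
language plpgsql immutable leakproof
as $$
begin
  return length(txt) < 20;
end
$$;

-- "body" is an assumed text column on the RLS-enabled table.
-- Because is_short is LEAKPROOF, the planner may evaluate it on the base
-- rows; without LEAKPROOF it would only see rows that passed the policy.
select count(*) from chat_messages where is_short(body);
```

None of this applies to a function called inside the policy expression itself, which is the situation in this issue.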
What about an index on the uuid column?
Yes, this is the right solution.
But in this case, your function is inside the RLS policy. If you're leaking data there, that's all your own responsibility.
We encourage using current_setting in RLS in the docs, and that has been there for quite a while. All the HTTP context GUCs could be used inside RLS, and we can't control the wrapper functions users put over them.
Hello Steve, what about an index on the uuid column?
Thanks dwagin, yeah that works. But the thing is, we didn't have this issue with the legacy GUCs that were available before v9.0.
So in the above example, by doing this:
create or replace function workspace()
returns uuid as $$
select (current_setting('request.jwt.claims.app_metadata.workspace', true))::uuid;
$$ language sql stable;
drop policy if exists sel on chat_messages;
create policy sel on chat_messages for select
using(
workspace = workspace()
);
begin;
set local role postgrest_test_anonymous;
select set_config('request.jwt.claims.app_metadata.workspace', '2156a8d9-d06f-40cd-adf8-411ae8c0bc76', true);
explain analyze SELECT COUNT(*) FROM chat_messages;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=87937.00..87937.01 rows=1 width=8) (actual time=1252.412..1252.413 rows=1 loops=1)
-> Seq Scan on chat_messages (cost=0.00..87937.00 rows=1 width=0) (actual time=1252.409..1252.409 rows=0 loops=1)
Filter: (workspace = (current_setting('request.jwt.claims.app_metadata.workspace'::text, true))::uuid)
Rows Removed by Filter: 3000000
Planning Time: 0.292 ms
Execution Time: 1252.451 ms
The query is now x14 faster, without adding an index. So the JSON GUC parsing is indeed a problem.
My proposal would be to have an allowlist config that brings back the legacy GUCs for certain keys to avoid the JSON parsing problem. Could be something like
http-context-shortcuts = "request.jwt.claims.app_metadata.workspace, request.headers.user_agent, request.cookies.session_id"
This seems related to https://github.com/PostgREST/postgrest/issues/1941, maybe the config proposed there could be merged with this one.
The query is now x14 faster, without adding an index. So the JSON GUC parsing is indeed a problem.
I still think it's only a problem when you parse it 3 million times like in your example on a seq scan. But that's not the GUCs fault.
My proposal would be to have an allowlist config that brings back the legacy GUCs for certain keys to avoid the JSON parsing problem.
You can easily do this in a pre-request function. Just set your custom GUC to the extracted value from the JSON.
I'd rather wait for schema variables to finally come to PG - because those will solve this nicely.
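The pre-request suggestion above could be sketched roughly as follows. The function name set_workspace_guc and the GUC name app.workspace are assumptions made for illustration; the function would be registered through PostgREST's db-pre-request setting.

```sql
-- Hypothetical pre-request function: parse the JSON claims once per request
-- and store the extracted value in a custom, transaction-local GUC.
create or replace function set_workspace_guc() returns void
language plpgsql
as $$
begin
  perform set_config(
    'app.workspace',  -- assumed GUC name
    current_setting('request.jwt.claims', true)::jsonb -> 'app_metadata' ->> 'workspace',
    true              -- local to the current transaction
  );
end
$$;

-- The policy then reads the plain-text GUC instead of parsing JSON.
drop policy if exists sel on chat_messages;
create policy sel on chat_messages for select
using (
  workspace = current_setting('app.workspace', true)::uuid
);
```

This keeps the per-row work down to a current_setting call and a uuid cast, similar to the legacy GUC example earlier in the thread.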
You can easily do this in a pre-request function. Just set your custom GUC to the extracted value from the JSON.
Yeah, that would work for avoiding the JSON GUC, but it will also come with a perf decrease because we don't have pipeline mode. Last time I checked, adding a pre-request was like a 15% reduction in throughput.
I'd rather wait for schema variables to finally come to PG - because those will solve this nicely.
True. Have you reviewed that feature by any chance? I wonder if it will land on pg16 and if indeed we can use it as we intend to.
I wonder if it will land on pg16
That's a downside I guess, it will be almost a year for pg16.
I still think it's only a problem when you parse it 3 million times like in your example on a seq scan. But that's not the GUCs fault.
As I mentioned above the example is contrived. But there are some real world queries with big tables where filtering on some millions of rows is legitimate. I do think the JSON GUC is a problem here since the perf problem doesn't happen on the legacy gucs.
The query is now x14 faster, without adding an index. So the JSON GUC parsing is indeed a problem.
In my environment only x3 (PG 13).
Hm, I've tried on pg13 (with nix-shell and postgrest-with-postgresql-13 psql) and I get a x9 perf increase.
And speed up when PARALLEL RESTRICTED
Circling back on this. How about this config:
# this is the default
http-context = "request.method, request.path, request.jwt.claims, request.headers, request.cookies"
# we also set `search_path` and `role` for every request, but those cannot be modified as they're core to our security model
GUCs can be removed and shortcuts can be added:
http-context = "request.jwt.claims.app_metadata.workspace, request.headers"
For every sub-level on request.jwt.claims.sub-level/request.headers.sub-level, we'll get the value of the sub-level json key.
This will solve this issue and https://github.com/PostgREST/postgrest/issues/1941.
I think it will be useful even if we can do it with pre-request later.
Just found out a plain SQL solution to the above issue: we can force computing the current_setting once with a MATERIALIZED CTE. So we have to change the above policy to:
drop policy if exists sel on chat_messages;
create policy sel on chat_messages for select
using(
(
with cached as materialized(
select (app_metadata() ->> 'workspace')::uuid as val
)
select workspace = val from cached
)
);
Now the policy is still correct and the COUNT is fast:
begin;
set local role to postgrest_test_anonymous;
select set_config('request.jwt.claims', '{"aal":"aal1","amr":[{"method":"password","timestamp":1670987059}],"app_metadata":{"provider":"email","team":"T01TQ1XH8RW","workspace":"2156a8d9-d06f-40cd-adf8-411ae8c0bc76","workspace_int":195,"workspace_small_text":"195"},"aud":"authenticated","email":"user@email.com","exp":1671591861,"phone":"","role":"authenticated","session_id":"9033fca4-07e0-42ff-9e2f-0b0aa738e3d5","sub":"ef79d643-550e-4e65-a963-ad6f50e67dfe","user_metadata":{}}', true);
SELECT COUNT(*) FROM chat_messages;
count
---------
1500000
explain analyze SELECT COUNT(*) FROM chat_messages;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=241687.00..241687.01 rows=1 width=8) (actual time=775.210..775.211 rows=1 loops=1)
-> Seq Scan on chat_messages (cost=0.00..237937.00 rows=1500000 width=0) (actual time=0.145..728.419 rows=1500000 loops=1)
Filter: (SubPlan 3)
Rows Removed by Filter: 1500000
SubPlan 3
-> Result (cost=0.05..0.06 rows=1 width=1) (actual time=0.000..0.000 rows=1 loops=3000000)
CTE cached
-> Result (cost=0.00..0.03 rows=1 width=16) (actual time=0.054..0.056 rows=1 loops=1)
InitPlan 2 (returns $1)
-> CTE Scan on cached (cost=0.00..0.02 rows=1 width=16) (actual time=0.058..0.060 rows=1 loops=1)
Planning Time: 0.238 ms
Execution Time: 775.264 ms
This should be added to the docs
This is epic! Also works wonders for things like postgres realtime too!
Note the above policy with the materialized CTE is not always a win. If the column is indexed, the above policy results in a noticeable slowdown (almost x6 slower):
create index on chat_messages(workspace);
drop policy if exists sel on chat_messages;
create policy sel on chat_messages for select
using(
(
with cached as materialized(
select (app_metadata() ->> 'workspace')::uuid as val
)
select workspace = val from cached
)
);
begin;
set local role to postgrest_test_anonymous;
select set_config('request.jwt.claims', '{"aal":"aal1","amr":[{"method":"password","timestamp":1670987059}],"app_metadata":{"provider":"email","team":"T01TQ1XH8RW","workspace":"2156a8d9-d06f-40cd-adf8-411ae8c0bc76","workspace_int":195,"workspace_small_text":"195"},"aud":"authenticated","email":"user@email.com","exp":1671591861,"phone":"","role":"authenticated","session_id":"9033fca4-07e0-42ff-9e2f-0b0aa738e3d5","sub":"ef79d643-550e-4e65-a963-ad6f50e67dfe","user_metadata":{}}', true);
explain analyze SELECT COUNT(*) FROM chat_messages;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=209086.43..209086.44 rows=1 width=8) (actual time=817.665..817.666 rows=1 loops=1)
-> Index Only Scan using chat_messages_workspace_idx on chat_messages (cost=0.43..205336.43 rows=1500000 width=0) (actual time=0.098..776.944 rows=1500000 loops=1)
Filter: (SubPlan 2)
Rows Removed by Filter: 1500000
Heap Fetches: 0
SubPlan 2
-> CTE Scan on cached (cost=0.03..0.05 rows=1 width=1) (actual time=0.000..0.000 rows=1 loops=3000000)
CTE cached
-> Result (cost=0.00..0.03 rows=1 width=16) (actual time=0.055..0.057 rows=1 loops=1)
Planning Time: 0.240 ms
Execution Time: 817.720 ms
While the original policy without the materialized CTE from above:
drop policy if exists sel on chat_messages;
create policy sel on chat_messages for select
using(
workspace = (app_metadata() ->> 'workspace')::uuid
);
begin;
set local role to postgrest_test_anonymous;
select set_config('request.jwt.claims', '{"aal":"aal1","amr":[{"method":"password","timestamp":1670987059}],"app_metadata":{"provider":"email","team":"T01TQ1XH8RW","workspace":"2156a8d9-d06f-40cd-adf8-411ae8c0bc76","workspace_int":195,"workspace_small_text":"195"},"aud":"authenticated","email":"user@email.com","exp":1671591861,"phone":"","role":"authenticated","session_id":"9033fca4-07e0-42ff-9e2f-0b0aa738e3d5","sub":"ef79d643-550e-4e65-a963-ad6f50e67dfe","user_metadata":{}}', true);
explain analyze SELECT COUNT(*) FROM chat_messages;
BEGIN
SET
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=35164.45..35164.46 rows=1 width=8) (actual time=124.649..124.650 rows=1 loops=1)
-> Index Only Scan using chat_messages_workspace_idx on chat_messages (cost=0.45..31414.95 rows=1499800 width=0) (actual time=0.082..78.337 rows=1500000 loops=1)
Index Cond: (workspace = ((((current_setting('request.jwt.claims'::text, true))::jsonb -> 'app_metadata'::text) ->> 'workspace'::text))::uuid)
Heap Fetches: 0
Planning Time: 0.353 ms
Execution Time: 124.694 ms
(6 rows)
So maybe we can just boil it down to "always use an index when comparing against current_setting() values; if that's not possible, make sure to cache them in a materialized CTE".
There is a better way to force pg to execute the app_metadata function only once for the query: make it use an InitPlan.
Instead of this:
create policy sel on chat_messages for select
using(
workspace = (app_metadata() ->> 'workspace')::uuid
);
You'll need to wrap the right side in a scalar subquery:
create policy sel on chat_messages for select
using(
workspace = (select (app_metadata() ->> 'workspace')::uuid)
);
Since the subquery does not depend on the outer query, it will result in an InitPlan. This will run it exactly once, and then put the result into the main query which can use an index or not, it doesn't matter - it will be fast in both cases.
First policy explained:
Aggregate (cost=121660.00..121660.01 rows=1 width=8) (actual time=15673.547..15673.548 rows=1 loops=1)
-> Seq Scan on chat_messages (cost=0.00..117937.00 rows=1489200 width=0) (actual time=0.026..15614.352 rows=1500000 loops=1)
Filter: (workspace = ((((current_setting('request.jwt.claims'::text, true))::jsonb -> 'app_metadata'::text) ->> 'workspace'::text))::uuid)
Rows Removed by Filter: 1500000
Planning Time: 0.281 ms
Execution Time: 15673.592 ms
Second policy explained:
Aggregate (cost=69187.03..69187.04 rows=1 width=8) (actual time=264.203..264.204 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Result (cost=0.00..0.03 rows=1 width=16) (actual time=0.028..0.029 rows=1 loops=1)
-> Seq Scan on chat_messages (cost=0.00..65437.00 rows=1500000 width=0) (actual time=0.052..214.629 rows=1500000 loops=1)
Filter: (workspace = $0)
Rows Removed by Filter: 1500000
Planning Time: 0.138 ms
Execution Time: 264.229 ms
Second policy + index:
Aggregate (cost=35168.46..35168.47 rows=1 width=8) (actual time=151.675..151.677 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Result (cost=0.00..0.03 rows=1 width=16) (actual time=0.027..0.027 rows=1 loops=1)
-> Index Only Scan using chat_messages_workspace on chat_messages (cost=0.43..31418.43 rows=1500000 width=0) (actual time=0.049..98.173 rows=1500000 loops=1)
Index Cond: (workspace = $0)
Heap Fetches: 0
Planning Time: 0.144 ms
Execution Time: 151.703 ms
interesting, so should be the same performance as materialized CTE but a bit less verbose to write?
No, it has better performance than a materialized CTE.
I did a quick test in https://github.com/orgs/supabase/discussions/9311#discussioncomment-5732511 and got the same result with either method; I'll do more. It makes sense they would be the same, as both set a constant value to compare rows against and get run once.
BUT I believe this method is the one scattered throughout my original table https://github.com/orgs/supabase/discussions/9311#discussioncomment-3815869 which did things like:
I just ran a 2nd test of the original auth.uid() = user_id policy on 25K rows, and got 43, 13, 3 for original, CTE, init with a non-indexed user_id, and .07, 13, .13 for original, CTE, init with an indexed user_id.
So the initplan way is faster in this case. If you have an indexed column it seems to take advantage of it, but still slower than just the indexed column test.
The difference you see between initplan and without for an indexed query is negligible. In theory, it's a very tiny overhead of running an additional initplan node in the query. But doing a few more runs gives me the following results:
Bare query using index, but no initplan:
Scalar subquery using index and initplan:
There's a lot more noise here than any noticeable difference between the two.
Added some more tests of a realistic two-part RLS policy here, with code: https://github.com/orgs/supabase/discussions/9311#discussioncomment-5734798
is_admin() OR auth.uid() = user_id
100K selects, no difference between indexed and not. Offset is adding limit 10 offset 90000, but only for admin as user gets 1 row. "wrapped" is the initplan version.
| role | normal (ms) | wrapped (ms) | CTE (ms) |
|---|---|---|---|
| user | 1900 | 10.3 | 53 |
| admin | 1900 | 18.5 | 18.5 |
| offset 90k | 1900 | 48 | 48 |
Edit: adding where user_id = 'UUID' to the query takes the time down to 0.2 msec, from 10 and 53, for a user query.
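For reference, the "wrapped" variant of the two-part policy measured above might look like the sketch below. The table name some_table and the policy name sel_wrapped are placeholders; is_admin() and auth.uid() are the functions named in the test, and the exact policy used in the linked discussion may differ.

```sql
-- Each function call is wrapped in a scalar subquery. Neither subquery
-- depends on the outer row, so each can be planned as an InitPlan and
-- evaluated once per query rather than once per row.
create policy sel_wrapped on some_table for select
using (
  (select is_admin()) or (select auth.uid()) = user_id
);
```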
Edit: solution below.
Having:
Doing a count is fast without RLS:
When RLS is enabled, the count is awfully slow:
This happens because:
- The cast (::jsonb) is not LEAKPROOF, which means that the casting will be done for each row of the query; PostgreSQL doesn't cache this static value.
- The json value inside request.jwt.claims is somewhat lengthy. We have no control over this as the JWT can come from an external system. Shorter json claims will lead to less time:
explain analyze SELECT COUNT(*) FROM chat_messages;
Aggregate (cost=117937.00..117937.01 rows=1 width=8) (actual time=3449.217..3449.218 rows=1 loops=1)
-> Seq Scan on chat_messages (cost=0.00..117937.00 rows=1 width=0) (actual time=3449.213..3449.213 rows=0 loops=1)
Filter: (workspace = ((((current_setting('request.jwt.claims'::text, true))::jsonb -> 'app_metadata'::text) ->> 'workspace'::text))::uuid)
Rows Removed by Filter: 3000000
Planning Time: 0.259 ms
Execution Time: 3449.257 ms