PostgREST / postgrest

REST API for any Postgres database
https://postgrest.org
MIT License
23.51k stars 1.03k forks source link

Configurable query grammar #2805

Open steve-chavez opened 1 year ago

steve-chavez commented 1 year ago

Problem

Solution

Make the query grammar configurable. This can be paired up with pre-config.

Say I want to only allow eq operators and only on primary keys:

-- pgrst.query_grammar is the config prefix 
-- '*' is the default "allow all"
-- ' ' empty is "disallow all"
select set_config('pgrst.query_grammar.op', ' ', true);

-- op.eq is the allowlist for columns
-- here we add all the pks 
select set_config('pgrst.query_grammar.op.eq', array_agg(c.table_name || '.' || c.column_name)::text, true) 
from information_schema.table_constraints tc
join information_schema.constraint_column_usage as ccu using (constraint_schema, constraint_name)
join information_schema.columns as c
  on c.table_schema = tc.constraint_schema and tc.table_name = c.table_name and ccu.column_name = c.column_name
where constraint_type = 'PRIMARY KEY' and c.table_schema = 'test';

-- obtain all the pks allowed
select current_setting('pgrst.query_grammar.op.eq') as eq_allowed;

eq_allowed | {items.id,items2.id,items3.id,clients.id,compound_pk.k1,compound_pk.k2,..

A similar config can be used for order too:

select set_config('pgrst.query_grammar.order', 'table.col ', true);

Notes

steve-chavez commented 1 year ago

Edit: Adding custom operators will be done with https://github.com/PostgREST/postgrest/issues/2028#issuecomment-1584082574

Another problem is adding new operators. Adding PostgREST specific functions for every new operator is cumbersome.

Idea:

select set_config('pgrst.query_grammar.op.groonga', '&@ ', true);

TODO: consolidate these config names.

To handle FTS operators where we translate fts to @@ to_tsquery(2 function calls on the input), we could encode the function composition with . on the setting, like:

select set_config('pgrst.query_grammar.op.fts', '@@ . to_tsquery ', true);

Potentially it could support many function calls. There might be some overlap with https://github.com/PostgREST/postgrest/pull/2523.


Adding operators on the codebase is simple. We just need to pass the config to our parser and sql fragments:

https://github.com/PostgREST/postgrest/blob/1f13e43abec9fb7acda3967d3645d28e6a1f112a/src/PostgREST/ApiRequest/QueryParams.hs#L205-L216

https://github.com/PostgREST/postgrest/blob/1f13e43abec9fb7acda3967d3645d28e6a1f112a/src/PostgREST/Query/SqlFragment.hs#L96-L106

Restricting the operators is harder. Don't have the implementation clear yet.

mitch-lbw commented 1 year ago

Hi @steve-chavez , i recently stumbled upon the same requirement to restrict the type of allowed operators and order criteria to a predefined set of columns, which IMHO would be a very enriching feature. Is the solution you proposed as pre_config function configuring restrictions for operations and order like this select set_config('pgrst.query_grammar.op', '*', true); already working or is this a proposal/idea? At least when adding the line above to a preconfig functionit wasn't able to get my postgrest API delivering any data. Also if added a line to restrict ordering to one column like

select set_config('pgrst.query_grammar.order', 'table.col', true);
select set_config('pgrst.query_grammar.order', 'table.col2', false);

i was not able to get a restriction on ordering. If it is already working, i most probably i made a mistake on configuring and would be thankful for a complete working example :-) Is there also any documentation available? Thanks in advance!

steve-chavez commented 1 year ago

@mitch-lbw No, this is not implemented. There's no consensus on the feature design yet.

steve-chavez commented 8 months ago

A related discussion. User asking about restricting operand values https://github.com/PostgREST/postgrest/discussions/3355.

gc-ft commented 3 months ago

Want to simply re-iterate how important this kind of a feature is to a lot of users in my opinion, at least bigger production setups using PostgREST would benefit from more hardening than is currently available.

Just a tiny quick-fix-suggestion that might not be "great", but still a first step solution while a final one is discussed and decided: publishing just the raw query in request.query (no need to store it parsed or anything on the postgREST end, raw is fine I believe) would be a first step. It would allow making functions on our own which look for certain things in the query string in order to stop the query using RLS.

For example, if the query string is exposed as a simple string in settings, we could write a function which returns true if a certain regex pattern matches request.query ... for example postgrest_query_has('(^|&)id=eq\.') - would return true if an eq filter on the field id is present in the query.

We then simply setup a RLS policy like this, which would automatically filter out all results and return nothing if id filter is missing in the request.

CREATE POLICY select_if_pattern_found
ON your_table_name
FOR SELECT
TO authenticated
USING (
    (SELECT postgrest_query_has('(^|&)id=eq\.'))
);

Of course, this would use the db however it should optimize most of the query away if the match is false. At least it is better than nothing and seems a very easy thing to add to expose as a variable in settings just like you do for path right now.

steve-chavez commented 3 months ago

@gc-ft Right, I think that's the simplest solution, this PR had a similar idea https://github.com/PostgREST/postgrest/pull/1710

Now I wouldn't recommend doing that in RLS since it can get really complicated as it is. But doing the filtering on a pre-request should work fine.