PostgREST / postgrest

REST API for any Postgres database
https://postgrest.org
MIT License
23.18k stars 1.02k forks source link

Feature request: column to column comparison #1105

Closed priyank-purohit closed 3 years ago

priyank-purohit commented 6 years ago

Description of issue

Currently PostgREST supports only column-to-value(s) comparison, however column-to-column comparisons are extremely common and too important to not support.

Implementation

This is a somewhat tricky feature to implement because you need to differentiate values from column names. The best solution I came up with is to add a .c. prefix between the operator and the column.

For example, ?and=(first_name.c.eq.last_name) vs ?and=(first_name.eq.last_name) to search for people with same first and last names vs. people whose first name is in the db as literal string 'last_name'.

Another potential solution: duplicate each operator (where it makes sense the user wants to do column comparison), and this duplicate operator (with a slightly different name) will specifically interpret the value supplied by the user as a column name. If the value supplied is not a column name, 404 response. For example, the .eq. operator's column comparison equivalent could be .ceq.

steve-chavez commented 6 years ago

Why not define that in the view you're exposing?, say like:

create view api.persons as (
  select id, first_name, last_name, first_name = last_name as is_tautonym 
  from private.persons
);

A computed column will also work:

create or replace function is_tautonym(api.persons) returns boolean as $$
  select $1.first_name = $1.last_name;
$$ language SQL stable;

Then you could do:

/persons?is_tautonym=is.true

For now I'd be against implementing this, as it's more clean to represent that kind of comparisons in the db.

priyank-purohit commented 6 years ago

@steve-chavez That works for that one particular query, but not for any other queries (you can create any column-to-value comparison query otherwise). Is there any other reason besides the cleanliness of the PostgREST code?

Alternatively, I could implement something like this in a custom version of PostgREST; but I'm wondering if would this be a difficult undertaking for someone with 0 Haskell knowledge?

steve-chavez commented 6 years ago

Considering the difficulty required towards implementing this and defining views/computed columns in a few lines of SQL, I'd say the latter is a much more easy option.

I think this kind of comparisons are core business logic, and as so, they should be defined at the db level with an appropriate name/comment on the column so the REST API can have proper documentation, as your project matures you're gonna need this.

Also there needs to be a limit in what kind of operations PostgREST exposes, having this somehow feels like a step closer to exposing custom functions in filters values.

priyank-purohit commented 6 years ago

I would say that something like 'regular expression comparison operator' is a use-case specific feature, as in not everyone will want/need that (and it increases risk of ReDoS). Hence, that is something you can expose via custom functions if you need.

I think of PostgREST as a way to create API for any PostgreSQL db, and a way to execute some common queries. I would say that SELECT * FROM table WHERE column1 > column2 AND column1 < column3 are examples of basic SQL queries that most people would find themselves executing. For example, I'm using this with sports and biological databases, both need to do those searches. I can create custom functions for each of the 100+ use cases, or have a schema-independent way to compare columns built into this tool!

I'll try to figure out how to implement this, but if someone experienced on contributing to this project can point me in the right direction (i.e. which file should I start at, which files would need to be modified to handle URL parsing and executing queries), I'd appreciate it!

steve-chavez commented 6 years ago

Better to drop the .c. prefix if you want to do a PR, as this would make your job harder on the Parsers module and looks somewhat confusing on the url, didn't give it much thought but maybe another set of operators like col_{eq,gt,is} would be more appropriate, this needs more discussion.

steve-chavez commented 6 years ago

Giving it more thought, the only operators that make sense on column to column comparison are: eq, neq, lt, lte, gt, gte.

I think we can have another HashMap of operators here, we'd need another constructor for them, those col_eq, col_neq, col_lt, col_lte, col_gt, col_gte would need to be parsed, and after that they would have to be converted to SQL.

We would also need another set of tests for this, making sure that computed columns comparisons work as well.

ruslantalpa commented 6 years ago

the need to strictly compare column values is very rare. It's even less likely that those two columns are in the same table (and this is what is being suggested/requested). usually you want to apply some kind of function to a value of a column before comparing it to another and implementing that is complicated and at the same time exposing that is dangerous.

As a test for my theory, let's see here examples of real tables with two columns where people would need to compare them.

priyank-purohit commented 6 years ago

I currently use PostgREST to work with a massive biological database. I thought there was some problem with the way it was created so I needed to do some validation.

There is a PredictedGenes table which has a bunch of columns, including: GeneStart, GeneEnd, Strand. I needed to check that Strand = '+' and GeneStart > GeneEnd and Strand = '-' and GeneStart < GeneEnd

I definitely agree that being able to do complex manipulation of column values for comparison is great and needed pretty frequently. However, there are also other Postgres functions that are commonly used, yet not supported by PostgREST (GROUP BY for example).

That been said, I personally can't work on this at the moment, busy with wrapping up my degree; plus I have 0 knowledge of Haskell (tried to follow Steve's dev guide, but can't make heads or tails of the language atm). :( But if this was implemented, I would definitely implement it in my application!

steve-chavez commented 6 years ago

@priyank-purohit For validating data, seems more suitable to query your PostgreSQL db directly(with psql or any client) than to go through PostgREST.

I hesitate on including the column to column comparison feature on master for now, would like to see more compelling use cases first.

sinogermany commented 6 years ago

Probably not so relevant - we are currently trying to do:

UPDATE <table> SET <col1> = <col2> WHERE <blah>

If we could introduce literals into the PATCH payload, that would've been really great.
I.e. something like:

PATCH /table_name?<blah>
{ "col1": "literal(col2)" }
wolfgangwalther commented 3 years ago

The conclusion seems to be that this will not be implemented right now, therefore I'm closing this.

For more complex scenarios then what the query string syntax has to offer, we can already use some very powerful tools: Views, Stored Procedures and Computed Columns.

Taking the example mentioned above:

There is a PredictedGenes table which has a bunch of columns, including: GeneStart, GeneEnd, Strand. I needed to check that Strand = '+' and GeneStart > GeneEnd and Strand = '-' and GeneStart < GeneEnd

This can be solved with a computed column function that just returns -1, 0, or 1 for the comparison of GeneStart and GeneEnd. This computed column can then be queried with the regular syntax.

steve-chavez commented 2 years ago

Just noted that @sinogermany use case above can be solved with JSON Patch copy operation. So relating to https://github.com/PostgREST/postgrest/issues/465.