PostgREST / postgrest

REST API for any Postgres database
https://postgrest.org
MIT License
23.44k stars 1.03k forks source link

Add the ability to specify source column for view embedding #1078

Closed mordae closed 6 years ago

mordae commented 6 years ago

Let's say I use the following schema:

CREATE TABLE person
(
    id bigserial NOT NULL,
    name character varying COLLATE pg_catalog."und-x-icu" NOT NULL,
    CONSTRAINT person_pkey PRIMARY KEY (id)
);

CREATE TABLE message
(
    id bigserial NOT NULL,
    sender bigint NOT NULL,
    recipient bigint NOT NULL,
    body text COLLATE pg_catalog."und-x-icu" NOT NULL DEFAULT ''::text,
    CONSTRAINT message_pkey PRIMARY KEY (id),
    CONSTRAINT message_recipient_fkey FOREIGN KEY (recipient)
        REFERENCES person (id) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE CASCADE,
    CONSTRAINT message_sender_fkey FOREIGN KEY (sender)
        REFERENCES person (id) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE CASCADE
);

CREATE OR REPLACE VIEW person_detail AS
 SELECT p.id,
    p.name,
    s.count AS sent,
    r.count AS received
   FROM person p
     JOIN LATERAL ( SELECT message.sender,
            count(message.id) AS count
           FROM message
          GROUP BY message.sender) s ON s.sender = p.id
     JOIN LATERAL ( SELECT message.recipient,
            count(message.id) AS count
           FROM message
          GROUP BY message.recipient) r ON r.recipient = p.id;

Now, when I ask for the key-based table embedding, everything looks fine:

http localhost:5000/message \
  select=='*,sender(*),recipient(*)'

But when I try to embed the view instead, I get something unexpected:

http localhost:5000/message \
  select=='*,sender:person_detail(*),recipient:person_detail(*)'

Both users are the same! Well... the generated SQL looks like this:

SELECT
    "message_1".*,
    row_to_json("person_detail_sender".*) AS "sender",
    row_to_json("person_detail_recipient".*) AS "recipient"

FROM "public"."message" AS "message_1"

LEFT JOIN LATERAL (
    SELECT "person_detail_2".*
    FROM "public"."person_detail" AS "person_detail_2"
    WHERE "person_detail_2"."id" = "message_1"."sender"
) AS "person_detail_sender" ON TRUE

LEFT JOIN LATERAL (
    SELECT "person_detail_2".*
    FROM "public"."person_detail" AS "person_detail_2"
    WHERE "person_detail_2"."id" = "message_1"."sender"
) AS "person_detail_recipient" ON TRUE

Let's see if we can guide it a little bit more closer to the home:

http localhost:5000/message \
  select=='*,sender:person_detail.sender(*),recipient:person_detail.recipient(*)'

Error: Could not find foreign keys between these entities, No relation found between message and person_detail

Well, that's the end of the road, isn't it?

Now for something completely different, what about asking in the other direction, just to see if it works:

http localhost:5000/person_detail \
  select=='*,sent_messages:message.sender(*),received_messages:message.recipient(*)'

Well, this works fine.

Would it be possible to specify the local key column to use when embedding a not-directly-referenced relation?

ruslantalpa commented 6 years ago

try

localhost:5000/message?select=id,body,sender:person_detail.sender(id,name),recipient:person_detail. recipient(id,name)

(avoid using *, always list columns in select)

mordae commented 6 years ago

I have tried both the query you suggested and a variant using the person table instead of the view and PostgREST does not seem to follow. Both failed with "Could not find foreign keys between these entities, No relation found between message and ...".

http localhost:5000/message \
  select=='id,body,sender:person_detail.sender(id,name),recipient:person_detail.recipient(id,name)'

http localhost:5000/message \
  select=='id,body,sender:person.sender(id,name),recipient:person.recipient(id,name)'

Help?

ruslantalpa commented 6 years ago

Did you restart PostgREST after you created your view. Try restarting

Also can you paste complete url string, not http commands, it’s hard to follow and not sure how it’s syntax works

On 15 Mar 2018, at 16:19, Jan Hamal Dvořák notifications@github.com wrote:

I have tried both the query you suggested and a variant using the person table instead of the view and PostgREST does not seem to follow. Both failed with "Could not find foreign keys between these entities, No relation found between message and ...".

http localhost:5000/message \ select=='id,body,sender:person_detail.sender(id,name),recipient:person_detail.recipient(id,name)'

http localhost:5000/message \ select=='id,body,sender:person.sender(id,name),recipient:person.recipient(id,name)' Help?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

mordae commented 6 years ago

Yes, I have restarted PostgREST several times and have tried current master built using stack to make sure I am using the most recent version available.

curl 'localhost:5000/message?select=id,body,sender:person.sender(id,name),recipient:person.recipient(id,name)'
# {"message":"Could not find foreign keys between these entities, No relation found between message and person"}

I have trouble following the code. It's mostly not commented and functions are dense and long. 11 indentation levels in addRelations is simply too much. I would normally try to submit a patch or at least sketch a possible solution, but I was not able to. Sorry.

steve-chavez commented 6 years ago

@mordae When you specify the person_detail.<colname> part in

GET /message?select=*,sender:person_detail.sender(*),recipient:person_detail.recipient(*)

The <colname>(sender or recipient) part is meant to reference a column of person_detail, and as per your schema that view doesn't have that column, that's why you get the no relation detection error(this error message could be improved), also that's why this request works:

GET /person_detail?select=*,sent_messages:message.sender(*),received_messages:message.recipient(*)

Since sender,recipient are columns of message.

That feature(introduced in #918) only allows to specify the target table/column of the JOIN but not the source column.

mordae commented 6 years ago

@steve-chavez I suspected that to be the reason. Consider this a feature request.

pododesk commented 6 years ago

@jg nein, leider nicht

David Getto

reach us at +49 621 681 27533

begriffs/postgrest on March 19, 2018 at 12:52pm wrote:

@steve-chavez ( https://github.com/steve-chavez )I suspected that to be the reason. Consider this a feature request.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub ( https://github.com/begriffs/postgrest/issues/1078#issuecomment-374186624 ) , or mute the thread ( https://github.com/notifications/unsubscribe-auth/AiSecZirQwNTgX8gOKt5g4_OQRpvGmAPks5tf5vfgaJpZM4SaStv ) .

steve-chavez commented 6 years ago

This would require to extend pgrst with an additional operation in the url, also disambiguating embeds can get more complicated with composite keys.

To provide a more generic solution without complicating the url too much how about if we disambiguate embeds based on foreign keys(the CONSTRAINT names), an example:

-- Using #918 to disambiguate
GET /person_detail?select=*,sent_messages:message.sender(*),received_messages:message.recipient(*)

-- FK Proposal
GET /person_detail?select=*,sent_messages:message_sender_fkey(*),received_messages:message_recipient_fkey(*)

This would be enough if we only had tables but since we have views that have artificial relationships we need to additionally(with the + url safe character) provide a table/view name for pgrst to do the embed, this would solve this issue:

GET /message?select=*,sender:message_sender_fkey+person_detail(*),recipient:message_recipient_fkey+person_detail(*)

@begriffs @ruslantalpa Thoughts?

ruslantalpa commented 6 years ago

i for example don't have them named specifically, they are auto generated so i am not sure a lot of people give much attention to their name. If you start using constraint names, then you need to start looking into their definitions (to see what columns they are using) so this would involve a lot of changes.

Why can't we use column.table format? We already have this format for the other relations (table.table for many-many and table.column on-many) so depending on the order of the column/table strings, it could inform the relation type in a way (this is off the top of my head, need to review the current state since we support omitting the fk name when we can guess the relation, we'd need a list of all the current formats and see how the new feature/format fits in).

steve-chavez commented 6 years ago

For me the table.column/table.table/column.table notation is confusing, table.column makes me think this notation it's related to pg dot notation syntax but then table.table/column.table don't follow.

I think this notation(same for Child/Parent):

GET /person_detail?select=*,sent_messages:message+sender(*) -- Child
GET /message?select=*,sender:person_detail+sender(*) -- Parent

Would be more intuitive because it reflects the operation of embedding the table plus + adding precision(both requests will work without + but they'll embed the first found Relation) to get the desired result.

About the FK constraint I'll refrain from discussing that until there's an issue about composite keys, the + notation could be extended to include it later.

steve-chavez commented 6 years ago

For now I've added the option to use the same notation for specifying the source column, now this works:

GET /message?select=*,sender:person_detail.sender(*),recipient:person_detail.recipient(*)
steve-chavez commented 6 years ago

Forgot there was a problem with the + character(https://github.com/begriffs/postgrest/pull/864#issuecomment-298828830), so the person_detail.sender notation will stay.

mordae commented 6 years ago

@steve-chavez Your self-reference-fix branch (with the above commit) works for me.

I am a little bit nervous about the number of possible meanings of the detail field, though. The selectors might benefit from some disambiguation. For example:

selectors := <empty> | <selector> | <selector> "," <selectors>
selector  := <alias> ":" <local-col> ":" <remote-rel> "." <remote-col> "(" <selectors> ")"

It would be possible to use full path specification (with unlimited remote part repeats), but that might open the door to some serious asymmetrical data DoS attacks, so the depth would have to be limited or some other way to make it safe would have to be devised.

selectors    := <empty> | <selector> | <selector> "," <selectors>
selector     := <alias> ":" <local-col> ":" <remote-path> "(" <selectors> ")"

remote-path  := <remote-head> <remote-links> ":" <remote-tail>
remote-head  := <remote-rel> "." <fwdref-col>
remote-tail  := <backref-col> "." <remote-rel>
remote-links := <empty> | <remote-link> | <remote-link> <remote-links>
remote-link  := ":" <backref-col> "." <remote-rel> "." <fwdref-col>

For example:

GET /person?select==id,items:id:person.cart_item.item:id.item

In any case, I don't think that trying to compress everything in a single detail field will work out for large databases with overlapping names.

steve-chavez commented 6 years ago

@mordae About limiting the depth of embeds there's some ideas in this issue https://github.com/begriffs/postgrest/issues/249#issuecomment-334629208, main idea is to do it a the pg level by limiting the query cost, also if you have suggestions for improving pgrst query grammer check https://github.com/begriffs/postgrest/issues/1017#issuecomment-346981443.