Closed Pigeo closed 3 years ago
As a real (usable in production) work-around, it would be great to be able to manually specify foreign keys in views
@Pigeo You're right. Thanks for reporting.
The many to many relationship detection between films-actors is not working on VIEWs.
While this seems to be related to VIEWs, it's not as simple as that.
Put both the tables and the views in the exposed schema. Now, with the views in place the same query to the tables will fail as well.
This does not work: GET /actors?select=films(title)
But this does work: GET /actors?select=films!personnages(title)
Now remove the personnages
VIEW: Both requests work. Both to the tables, as well as to the remaining views.
Once a view of the junction table is available, there will be two ways to do the m2m embedding: through the table directly, or through the view (which is effectively the same, because we can't do anything with the additional columns on the junction table right now). But instead of a disambiguation error, we just get a "could not find any relationship" error. This is the bug.
- But this does work:
GET /actors?select=films!personnages(title)
In my public schema that only contains views (respectively named films
, actors
, and personnages
), this last syntax doesn't work neither (from your explanations, I understand this is because there's an ambiguity between the personnages
VIEW located in my public schema and the personnages
TABLE located in my private schema).
Now remove the
personnages
VIEW: Both requests work. Both to the tables, as well as to the remaining views.Once a view of the junction table is available, there will be two ways to do the m2m embedding
If I remove the personnages
VIEW from my public schema, then it says permission denied for schema Private_Schema
(since the TABLE personnages
is located in my "Private_Schema"...)
=> If you're suggesting I should put some real tables in my public schema in order to get it working, I think this is not a good solution.
Basically, the problem is :
personnages
, then there's ambiguity and even with !personnages
syntax it won't work (it says Could not find foreign keys between these entities. No relationship found between actors and films
)permission denied
, since the TABLE is located in the private schema
So, neither option is working for me.Do you have any suggestion (other than removing the VIEW and putting the TABLE in the public schema, which is generally considered bad practice from what I understand)?
- But this does work:
GET /actors?select=films!personnages(title)
In my public schema that only contains views (respectively named
films
,actors
, andpersonnages
), this last syntax doesn't work neither (from your explanations, I understand this is because there's an ambiguity between thepersonnages
VIEW located in my public schema and thepersonnages
TABLE located in my private schema).
Ah, interesting. I had the view and the table in the same schema, so they had different names. So even with the hint "it does not work" for you. Does that mean "no relationship found" or "406 ambiguity"?
If I remove the
personnages
VIEW from my public schema, then it sayspermission denied for schema Private_Schema
(since the TABLEpersonnages
is located in my "Private_Schema"...)=> If you're suggesting I should put some real tables in my public schema in order to get it working, I think this is not a good solution.
I am not suggesting that, this was just part of trying to identify the problem.
Basically, the problem is :
* if I create a VIEW of the junction table `personnages`, then there's ambiguity and even with `!personnages` syntax it won't work (it says `Could not find foreign keys between these entities. No relationship found between actors and films`) * if I don't create a VIEW of the junction table, then it says `permission denied`, since the TABLE is located in the private schema So, neither option is working for me.
I think giving permissions to the data table in the private schema is perfectly alright. If you don't need the view because of other columns, but just as a junction table, you can remove the view and just use the !personnages
hint.
Using permissions on the base tables will allow you to use row level security with sql users as well - that's something that is not possible with views, because those are effectively SECURITY DEFINER
. That's one of the big limitations of the whole "views in the exposed schema / tables in a private schema" approach.
Do you have any suggestion (other than removing the VIEW and putting the TABLE in the public schema, which is generally considered bad practice from what I understand)?
What you can currently do is use different names for the tables and the views. So if you name your junction table data_personnages
and the view personnages
then you will be able to embed with the !personnages
hint.
Down the road we will of course try to fix this, so that the relationship can be detected.
So even with the hint "it does not work" for you. Does that mean "no relationship found" or "406 ambiguity"?
HTTP 404
+ the following body :
{
"message": "Could not find foreign keys between these entities. No relationship found between actors and films"
}
Ok, thanks for clarifying.
I think giving permissions to the data table in the private schema is perfectly alright. If you don't need the view because of other columns, but just as a junction table, you can remove the view and just use the
!personnages
hint.
Actually, I need the view, because my client asks that it has to be exposed in the REST API.
Using permissions on the base tables will allow you to use row level security with sql users as well - that's something that is not possible with views, because those are effectively
SECURITY DEFINER
. That's one of the big limitations of the whole "views in the exposed schema / tables in a private schema" approach.
I'm not using row level security for the moment, but I thought it was possible on VIEWs, by using CREATE VIEW
... WITH (security_barrier)
, as explained in PostgreSQL documentation :
When it is necessary for a view to provide row level security, the security_barrier attribute should be applied to the view
What you can currently do is use different names for the tables and the views. So if you name your junction table
data_personnages
and the viewpersonnages
then you will be able to embed with the!personnages
hint.
While I can confirm this works (I just tested it), it's not very practical to tell end users that for some obscure reason they have to add that hint there. I'll have to implement a rewrite rule in the API gateway in order to automatically add the hint for them... But at least that's a reasonable work around until the bug is fixed.
Thanks a lot for your investigations and suggestions!
I'm not using row level security for the moment, but I thought it was possible on VIEWs, by using
CREATE VIEW
...WITH (security_barrier)
, as explained in PostgreSQL documentation :When it is necessary for a view to provide row level security, the security_barrier attribute should be applied to the view
With this approach you basically have to recreate the RLS manually in your VIEWs. It works, but imho it's not as nice. And afaik you can be more fine-grained with RLS about the differences between INSERT / UPDATE / DELETE, that's not possible with the security_barrier approach.
Yes it works. If postgres had SECURITY INVOKER
views however, that would make life much easier.
While I can confirm this works (I just tested it), it's not very practical to tell end users that for some obscure reason they have to add that hint there. I'll have to implement a rewrite rule in the API gateway in order to automatically add the hint for them... But at least that's a reasonable work around until the bug is fixed.
Using hints is something that I suggest you make your users do in any case. This is because once you create other tables/views that happen to form a relationship between your source tables as well, suddenly your query will break without the hint, because of ambiguity.
Or, let me rephrase that: In your example above the request /actors?select=films(title)
fails now with "relationship not found". Once we fix the bug, it will fail with "406 ambiguity". It is expected that you will need to add the hint here. Covering this behind a rewrite rule seems not useful to me.
Once we fix the bug, it will fail with "406 ambiguity". It is expected that you will need to add the hint here
There's no real ambiguity here, since the TABLE is not accessible (because it is located in the private schema). Trying to use the TABLE by specifying appropriate hint results in error message permission denied
, as explained above. So, imho PostgREST should be smart enough to see that there's only one working path here, and should assume it instead of saying that there's a (fake) ambiguity. But I understand that adding that level of intelligence and sophistication takes more time and is maybe not the priority.
Moreover, what do you mean exactly by "ambiguity" ? Having 2 junction TABLES with different names (let's say personnages
and personnages2
) is one kind of ambiguity (several possible ways to make the join), whereas having one TABLE and one VIEW in different schemas sharing the same name is a totally other kind of ambiguity (several items having the same name). I agree the first kind of ambiguity can be resolved by using hint to explicitly name the junction TABLE to be used, whereas the second kind of ambiguity can not be resolved by using hint, since both the TABLE and the VIEW have exactly the same name.
Let's put it differently : in order to remove the ambiguity, you're suggesting me to rename the VIEW, so that TABLE and VIEW don't have the same name anymore, and then to use a hint in order to disambiguate something that wasn't really ambiguous.
Why not consider instead that as soon as you have a VIEW in the public schema which has the same name than a TABLE in the private schema, then the hint should always use the VIEW from the public schema (instead of saying there's an ambiguity of type 2)? (look at how it works in PostgreSQL : if you don't prefix the table name with the schema name in your SQL query, it assumes you're talking about the item from the current schema, and doesn't say it's ambiguous...)
I don't know exactly what can be done and what should be done, but I do believe that there is a good margin for progress here.
There's no real ambiguity here, since the TABLE is not accessible (because it is located in the private schema). Trying to use the TABLE by specifying appropriate hint results in error message
permission denied
, as explained above. So, imho PostgREST should be smart enough to see that there's only one working path here, and should assume it instead of saying that there's a (fake) ambiguity. But I understand that adding that level of intelligence and sophistication takes more time and is maybe not the priority.
Yeah, I think is a bit more complicated. To be able to make the decision based on permissions, we would need to load the whole database schema with the current role or alternatively load all the permissions for all database users together with the structure. I don't see any of those two happening at all, because this would make things very complicated and slow.
Plus: All of a sudden you would need to take your own permissions into account on the client-side when making the request. Should I add a hint or not? Depends on my database permissions. Uh.
Moreover, what do you mean exactly by "ambiguity" ? Having 2 junction TABLES with different names (let's say
personnages
andpersonnages2
) is one kind of ambiguity (several possible ways to make the join), whereas having one TABLE and one VIEW in different schemas sharing the same name is a totally other kind of ambiguity (several items having the same name). I agree the first kind of ambiguity can be resolved by using hint to explicitly name the junction TABLE to be used, whereas the second kind of ambiguity can not be resolved by using hint, since both the TABLE and the VIEW have exactly the same name.
You are right, there are different kind of ambiguities. For one, it seems like having the junction as a VIEW or as a TABLE doesn't make a difference, because the it's the same base table and the same foreign keys that are used for the join. In theory it could even be possible to merge those into the same relationship, with the functionality we have right now. But this will not hold up in the future anymore, when we extend the embedding feature: We're currently thinking about implementing some way to include the non-fk columns of the junction table in this output as well. Once we do that, there might be a real difference between using the VIEW or the TABLE.
Then other point you're bringing up is, that it seems to be impossible to disambiguate some of those m2m relationships, when VIEW and TABLE have the same name. I think I faced the same problem as well already. The only way to do that would be to add the schema in question in the hint as well. I'm not sure whether that's currently supported or not. If I remember correctly, the disambiguation error message shows those tables qualified with the schema already. So yes, this is something that we might still have to figure out.
Let's put it differently : in order to remove the ambiguity, you're suggesting me to rename the VIEW, so that TABLE and VIEW don't have the same name anymore, and then to use a hint in order to disambiguate something that wasn't really ambiguous.
Unfortunately, ambiguous to you is not the same as ambiguous to PostgREST ;). But yeah, I can see where you're coming from. I definitely agree that we need a proper way to solve this "same name, same relationship" case.
Why not consider instead that as soon as you have a VIEW in the public schema which has the same name than a TABLE in the private schema, then the hint should always use the VIEW from the public schema (instead of saying there's an ambiguity of type 2)? (look at how it works in PostgreSQL : if you don't prefix the table name with the schema name in your SQL query, it assumes you're talking about the item from the current schema, and doesn't say it's ambiguous...)
This would conflict with the idea of getting the non-fk columns out of a junction table. But I tend to agree that it's probably worth more to be able to use those VIEWs properly.
I don't know exactly what can be done and what should be done, but I do believe that there is a good margin for progress here.
Yes, I agree. Thanks for your valuable input on this.
Why not consider instead that as soon as you have a VIEW in the public schema which has the same name than a TABLE in the private schema, then the hint should always use the VIEW from the public schema (instead of saying there's an ambiguity of type 2)? (look at how it works in PostgreSQL : if you don't prefix the table name with the schema name in your SQL query, it assumes you're talking about the item from the current schema, and doesn't say it's ambiguous...)
This would conflict with the idea of getting the non-fk columns out of a junction table. But I tend to agree that it's probably worth more to be able to use those VIEWs properly.
Actually that's wrong. non-fk columns should only ever be taken from the VIEW, because the junction table in the private schema, can not be the target of an embed. So it should also not be possible to select columns from it, even when it's used for an m2m join. So merging those two relationships together might be possible. Not only the name needs to be the same but also the foreign keys in question. Otherwise you could disambiguate with the name of the FK anyway.
I'm not entirely sure, yet, how that interacts with schemas in the db-extra-search-path
though. I will investigate this first.
@Pigeo does you example work correctly with 6.0.2?
@Pigeo does you example work correctly with 6.0.2?
I'm having difficulties to run version 6.0.2. It says :
Attempting to connect to the database...
{"details":"could not connect to server: No such file or directory\n\tIs the server running locally and accepting\n\tconnections on Unix domain socket \"/tmp/.s.PGSQL.5432\"?\n","code":"","message":"Database connection error"}
Whereas version 7.0.0 and 7.0.1 are running fine, with the same /etc/postgrest.conf
configuration file... Has something changed in the syntax of the configuration file between version 6.0.2 and 7.0.0 ??
Here is my configuration file :
server-proxy-uri = "http://secret.url.com:19813"
server-port = 3000
db-schema = "REST_API_v1_0"
db-uri = "@/home/myfile.cred"
db-anon-role = "anonymous"
max-rows = 100
and the myfile.cred
file is a one-liner:
postgres://myuser@/THESOC
The connection problem seems to be related to the fact that I'm using UNIX socket... Does something has changed about that between versions 6.0.2 and 7.0.0 ?
6.0.2 only supports tcp db connection
On 25 Nov 2020, at 20:36, Pierre-Aurélien Georges notifications@github.com wrote:
@Pigeo does you example work correctly with 6.0.2?
I'm having difficulties to run version 6.0.2. It says :
Attempting to connect to the database... {"details":"could not connect to server: No such file or directory\n\tIs the server running locally and accepting\n\tconnections on Unix domain socket \"/tmp/.s.PGSQL.5432\"?\n","code":"","message":"Database connection error"} Whereas version 7.0.0 and 7.0.1 are running fine, with the same /etc/postgrest.conf configuration file... Has something changed in the syntax of the configuration file between version 6.0.2 and 7.0.0 ??
Here is my configuration file :
server-proxy-uri = "http://secret.url.com:19813" server-port = 3000 db-schema = "REST_API_v1_0" db-uri = "@/home/myfile.cred" db-anon-role = "anonymous" max-rows = 100 and the myfile.cred file is a one-liner: postgres://myuser@/THESOC
The connection problem seems to be related to the fact that I'm using UNIX socket... Does something has changed about that between versions 6.0.2 and 7.0.0 ?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.
@Pigeo does you example work correctly with 6.0.2?
@ruslantalpa nice catch!
I confirm you that it's working correctly with version 6.0.2 :
/actors?select=films(title)
returns the correct result (and the hint is not needed)
Starting with version 7.0.0 it doesn't work anymore (as explained above in this thread)
things changed in #1401 and #1430, the question is if in 6.0.2 this works simply because the "first" relation was picked and it happens to be the right one or #1401 and #1430 introduced additional "junk" relations, also #1625 introduced even more relations into the mix so things could be very complicated to track down at this stage. without actually looking at the final generated "relations" array.
because the junction table in the private schema, can not be the target of an embed.
I think this is the solution. The junction table should only be searched on the exposed schema and not on the private schema.
I would consider searching the private junction a "leak" in our "schema isolation" approach.
We already have an example of this happening within our fixtures:
GET localhost:3000/users?select=id,name,articles(*)
{
"details": [
{
"cardinality": "m2m",
"origin": "test.users",
"relationship": "private.article_stars[user][article]",
"target": "test.articles"
},
{
"cardinality": "m2m",
"origin": "test.users",
"relationship": "test.articleStars[user][article]",
"target": "test.articles"
},
{
"cardinality": "m2m",
"origin": "test.users",
"relationship": "test.limited_article_stars[user][article]",
"target": "test.articles"
}
],
"hint": "By following the 'details' key, disambiguate the request by changing the url to /origin?select=relationship(*) or /origin?select=target!relationship(*)",
"message": "More than one relationship was found for users and articles"
}
The private.article_stars[user][article]
shouldn't be considered there.
But instead of a disambiguation error, we just get a "could not find any relationship" error. This is the bug.
Weird that that happens on this issue, I'd expect a similar error to the above.
Also, is there value in detecting more than one junction VIEW? I mean, since the extra columns in the junction(like "character" in this issue) are lost anyway.
I think this is the solution. The junction table should only be searched on the exposed schema and not on the private schema.
I would consider searching the private junction a "leak" in our "schema isolation" approach.
I don't know about this one. On a regular m2m embed, the junction table is not really queried. I think there is value in allowing this. There's no point in creating a view to the junction table in the exposed schema, when the junction table should never be queried directly, but just used for m2m embeds. And we might have people already using that and I think we should not make this a breaking change.
On the other side there is no point in keeping a duplicate of the relationship, when we have a view giving us the same base columns. I think we should do the following:
This whole thing is not the bug here, though. This would just be an improvement to throw away useless embedding options, making the choice easier (and unambiguous if there are no others!).
But instead of a disambiguation error, we just get a "could not find any relationship" error. This is the bug.
Weird that that happens on this issue, I'd expect a similar error to the above.
Note that this bug is in fact unrelated to views. Those just expose the bug a lot easier. Do the following, though:
CREATE TABLE exposed.actors (...);
CREATE TABLE exposed.films (...);
CREATE TABLE exposed.personnages (...);
CREATE TABLE private.personnages (...);
So just create two separate tables, both with the same fks and the same name, but obviously only one of them in the exposed schema. The error will be the same "could not find any relationship". Drop one of the tables and it works again.
Doing this, I just found another bug, by the way: Do the same thing but make the exposed.personnages
table just a table with other columns (no fks). So just re-use the same name as the private junction table. This will result in this SQL error:
{
"hint": "There is an entry for table \"personnages\", but it cannot be referenced from this part of the query.",
"details": null,
"code": "42P01",
"message": "invalid reference to FROM-clause entry for table \"personnages\""
}
Also, is there value in detecting more than one junction VIEW? I mean, since the extra columns in the junction(like "character" in this issue) are lost anyway.
As discussed elsewhere, I would really like to find a way to make those columns usable. If we could at least keep this enhancement possible for now by still treating multiple views in the exposed schema that use the same fks separately, that would be cool.
On a regular m2m embed, the junction table is not really queried
The table Is queried, it's referenced in the query and the query should never reference entities that are not in the exposed schema, this goes against the entire security model. If there is no "junction" table in the exposed schema, there is no relation.
we might have people already using that
At least some time ago, (before multi schema support) this was not even possible by mistake, even if somehow the relation leaked, since it was enforced by the search_path, the table would not be found (since it was not qualified by the schema name). This security guarantee/feature was lost and now "set search_path" is largely meaningless (it only works if some function references something unqualified and this usually more of a annoyance then helpful).
The reason we "went looking" for relations in the entire database is because this the only way to detect relations for the views in the "api" schema (by looking at the relations between their source tables), if it was only the tables, the query would have only looked at the tables in the "api" schema.
the bug is probably here https://github.com/PostgREST/postgrest/blob/9adec12a678afceca78f6bfebba236b031e63487/src/PostgREST/DbRequestBuilder.hs#L175 the junction table is not checked to be in the "api" schema the same way the related tables are checked here https://github.com/PostgREST/postgrest/blob/9adec12a678afceca78f6bfebba236b031e63487/src/PostgREST/DbRequestBuilder.hs#L146
Another thing that should be added is a “final stage filter” in the dbstructure code to filter only relations where all the tables in the relation are from “exposed schemas” and also within the same relation, all 3 tables are within the same schema.
This will probably reduce by a lot the size of the “relations” in big schemas and help with lookup speed
I don't know about this one. On a regular m2m embed, the junction table is not really queried.
@wolfgangwalther Also keep in mind that for the m2m embed to work in this case, the jwt role would need a GRANT SELECT on private.junction
. So definitely a "leak", it should be fixed.
The table Is queried, it's referenced in the query and the query should never reference entities that are not in the exposed schema,
Of course the table is part of the query, but no columns are taken from it. In that sense the table is not queried. Maybe "requested" is a better word. The table is not requested from the client-side. It is just used internally to map between two other tables.
this goes against the entire security model. If there is no "junction" table in the exposed schema, there is no relation.
@wolfgangwalther Also keep in mind that for the m2m embed to work in this case, the jwt role would need a
GRANT SELECT on private.junction
. So definitely a "leak", it should be fixed.
Sorry guys, but exactly because of what Steve says, this is NOT a security issue or leak. If the jwt role does not have select privileges on the table (or even USAGE privileges on the schema), then access is denied. So all good.
In general, PostgREST itself can not leak anything - except JWT secrets and stuff like that. Everything else is handled through SQL permissions. Dont' want something to show up or be accessed? Don't grant the privileges! But do not ever rely on PostgREST for security.
With this out of the way, if the consensus is to not allow those "hidden" junction tables, then I am ok with it. For those, for whom it is a breaking change, the fix is simple (SELECT * FROM
view...), it is easier to reason about and probably also easier to do in code then my proposed "merging" solution from above. All good reasons design-wise.
The reason we "went looking" for relations in the entire database is because this the only way to detect relations for the views in the "api" schema (by looking at the relations between their source tables), if it was only the tables, the query would have only looked at the tables in the "api" schema.
Ok, that makes a lot of sense. I can see how, in terms of this approach, that thing is a bug as well. In that case the "breaking change" wouldn't matter, because it's not a feature.
If I'm not mistaken, then the other 2 bugs I mentioned above would also be solved by disallowing hidden m2ms - they just won't come up anymore.
Environment
Description of issue
Trying to reproduce the same database of films than in the documentation, it turns out that resource embedding on views doesn't always work as explained in the documentation (however, when replacing the views with tables, it then works fine and as expected / explained in the documentation)
Steps to reproduce:
The same request than in the documentation doesn't work:
{{baseUrl}}/actors?select=films(title)
returns{"message": "Could not find foreign keys between these entities. No relationship found between actors and films"}
Same problem on the other side :
{{baseUrl}}/films?select=*,actors(name)
returns same error messageOther requests are working fine, such as:
{{baseUrl}}/personnages?select=*,films(*)
or{{baseUrl}}/nominations?select=films(*)
If I replace the views by real tables, PostgREST works as expected and as explained in the documentation (but of course it's not a good idea to directly expose the real tables)