Closed: viiicky closed this issue 4 years ago.
I'm not sure whether your approach is really the best way to go about this, but personally I would love to have this for debugging purposes. If I could easily have the query returned instead of executed, that might help development a lot. One way to do this would be to just make the same request with a different `Accept` header. So instead of `Accept: application/json`, you could do something like `Accept: application/sql` and get the query-to-be-run.
Now, not really answering your question: Have you considered putting those CTEs, `GROUP BY`s and aggregates into `VIEW`s or `FUNCTION`s and then querying those through PostgREST? This way you would still get the PostgREST filters for free. What kind of problems / limitations would you face when doing it this way?
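As a sketch of that suggestion - a CTE plus `GROUP BY` wrapped in a view that PostgREST could then filter as usual. The schema, table and column names here are invented for illustration:

```sql
-- Hypothetical example: a view hiding a CTE + GROUP BY,
-- exposed by PostgREST at /monthly_totals, where the usual
-- ?column=op.value filters still apply to the result columns.
CREATE VIEW api.monthly_totals AS
  WITH paid AS (
    SELECT * FROM api.orders WHERE status = 'PAID'
  )
  SELECT date_trunc('month', created_at) AS month,
         customer_id,
         sum(amount) AS total_amount
  FROM paid
  GROUP BY 1, 2;

-- e.g.: GET /monthly_totals?total_amount=gte.2000&order=month.desc
```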
@wolfgangwalther for debugging (locally) just log all queries and look at the database logs; no need for special endpoints for this
> for debugging (locally) just log all queries and look at the database logs, no need for special endpoints for this
Sure, that's what I am doing right now. But sometimes you would first need to create that local environment to replicate. If I face something weird in production (or on a server where I don't have access to the database logs...), instead of manually replicating the environment and the issue first, I could immediately repeat the same request with the other header and get the query. I guess this could also improve bug reports, if people can easily provide the executed query in the issue as well.
> Have you considered putting those CTEs, `GROUP BY`s and aggregates into `VIEW`s or `FUNCTION`s and then query those through PostgREST? This way you would still get the PostgREST filters for free. What kind of problems / limitations would you face when doing it this way?
Hmmm, in order to keep the question concise and to the point, I simplified it too much @wolfgangwalther . The actual use-case is much more complex, and the view approach would not work out. In a nutshell, we have to create dynamic SQL based on user input. It is this dynamic SQL, which I was talking about in my earlier response, that becomes complex because of `GROUP BY`s, aggregates, CTEs etc. What CTEs to create, what dimensions to have in the `GROUP BY`, what measures to aggregate - everything gets decided based on the client request.
Good news is, I have written all of this already; the only part left is the "where" clause, which, instead of writing on my own, I am hoping to re-use from this library.
Also @wolfgangwalther , do you have any inputs on which place to look into for going the `Accept` header way? With the help of @ruslantalpa , I have already managed to get the query out, but I am now more inclined to identify such requests based on the header instead of the `query` prefix in the path.
I will keep this thread updated if I find something from my side before you reply back. Also, I am new to Haskell, so please excuse any wrong terminology used by me. Thanks.
> Hmmm, in order to keep the question concise and to the point, I simplified it too much @wolfgangwalther . The actual use-case is much complex, and the view approach would not work out. In a nutshell, we have to create dynamic SQL based on user input. It is this dynamic SQL which I was talking about in my earlier response that becomes complex because of `GROUP BY`s, aggregates, CTEs etc. What CTEs to create, what dimensions to have in `GROUP BY`s, what measures to aggregate - everything gets decided based on client request. Good news is, have written all of this already, the only part left is the "where" clause, which instead of writing on my own, I am hoping to re-use from this library.
I guessed you were going to say something like that. In theory, that's nothing (yet) that wouldn't be possible with an RPC/`FUNCTION` as well. Of course, this would be limited, if:
- you need to have those filters on the source table and not on the result (pushing down those conditions into the function is not possible)
- you don't know the possibly returned columns in advance (you could return `JSON` from the function, but couldn't use filtering properly anymore)
Still, there's one fundamental problem I see: If we were to introduce either a `query` endpoint or the same thing through `Accept` headers, I think it would be reasonable to expect this to throw errors for requests where you e.g. name some columns to filter on that don't exist. It wouldn't make much sense to me if those requests returned a wrong SQL query without validating it at all (e.g. the explain query could still be run, so that postgres still parses the query).
Now if the query is indeed parsed - you will need to create a table/view (endpoint) that has all the possible column names you might want to filter on, to be able to get those queries you are looking for. Not sure whether that's possible in your case.
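A minimal illustration of that validation idea: `EXPLAIN` makes postgres parse and plan a statement without executing it, so misnamed columns still error out. The table name is just borrowed from the example elsewhere in this thread:

```sql
-- Parsed and planned, but never executed:
EXPLAIN SELECT * FROM people WHERE amount >= 1000;

-- A misspelled column fails at parse/analyze time anyway:
--   EXPLAIN SELECT * FROM people WHERE amonut >= 1000;
--   ERROR:  column "amonut" does not exist
```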
> Also @wolfgangwalther , do you have any inputs on which place to look into for going the `Accept` header way? With the help of @ruslantalpa , I have already managed to get the query out, but I am now more inclined to identify such requests based on the header instead of the `query` prefix in the path.
I'm assuming you're getting the query somewhere around here? It already has `stm` and `contentType` on the same line, so that should be a good starting point for both, I guess.
> I will keep this thread updated if I find something from my side before you reply back. Also, I am new to Haskell, so please excuse any wrong terminologies used by me. Thanks.
I am new as well :). @steve-chavez will be able to help you much better.
@viiicky Have you tried creating a stored procedure that does dynamic SQL?
Here's an example: https://github.com/PostgREST/postgrest/issues/915#issuecomment-553188824. With that you can reuse all of the PostgREST filters (including the WHERE part).
> For example, below request to PostgREST:
> `GET /query/people?amount=gte.1000&bill=eq.true&date=gte.2020-06-30T18:30:00.000Z&state=in.(DONE)&everified=eq.true&limit=10&offset=0&or=(status.ilike.*ACTIVE*)&total_amount=gte.2000 HTTP/1.1`
> should return:
> `select * from people where amount >= 1000 and bill is true and date >= '2020-06-30T18:30:00.000Z' and state in ('DONE') and verified is true and (status ilike '%ACTIVE%') and total_amount >= 2000 offset 0 limit 10`
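A rough sketch of what such a dynamic-SQL stored procedure could look like. The function name, signature and columns are invented here (not the actual code from the linked comment); `format()` with `%I`/`%L` keeps the interpolation safe against SQL injection:

```sql
-- Hypothetical dynamic-SQL function: the column to filter and
-- order by is decided at call time, the query is built as text
-- and then executed.
CREATE FUNCTION api.people_report(col text, min_value numeric)
RETURNS SETOF api.people
LANGUAGE plpgsql STABLE AS $$
BEGIN
  RETURN QUERY EXECUTE format(
    'SELECT * FROM api.people WHERE %I >= %L ORDER BY %I',
    col, min_value, col
  );
END
$$;
```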
> Still, there's one fundamental problem I see: If we were to introduce either a query endpoint or the same thing through Accept headers,
One problem with this direction: the query structure might change later when we parametrize the values. The WHERE part could change to: `where amount >= $1 and bill is $2 and date >= $3 and state in ($4) and total_amount >= $5`. I'm guessing that would not be helpful for reusing the query in another client unless we somehow expose the parametrized values.
@wolfgangwalther @steve-chavez yes, I have written a bunch of postgres functions already - they first create the dynamic SQL and then execute it. I cannot use the `/rpc` option of PostgREST though, as it requires POST requests.
btw, I just finished up my requirement. This unblocks me! Here is the diff: https://github.com/PostgREST/postgrest/compare/master...viiicky:master in case you guys are interested @ruslantalpa @wolfgangwalther @steve-chavez
Let me know if you guys have any comments. Also, let me know if you think this feature can make it to upstream, in which case, happy to contribute after fixing comments, if any.
This is how it works. Usual requests, nothing changes:

```
% curl http://localhost:3000/todos -i
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Thu, 20 Aug 2020 06:27:37 GMT
Server: postgrest/7.0.1 (dfa1fca)
Content-Type: application/json; charset=utf-8
Content-Range: 0-1/*
Content-Location: /todos

[{"id":1,"done":false,"task":"finish tutorial 0","due":null},
{"id":2,"done":false,"task":"pat self on back","due":null}]
```
Same request with the new header:

```
% curl http://localhost:3000/todos -i -H "Accept: application/sql"
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Thu, 20 Aug 2020 06:27:45 GMT
Server: postgrest/7.0.1 (dfa1fca)
Content-Type: application/sql; charset=utf-8

SELECT "api"."todos".* FROM "api"."todos"
```
I have of course added this header support only for the `ActionRead` kind of `Action`.
> I think it would be reasonable to expect that this request should throw errors for requests, where you e.g. name some columns to filter on that don't exist. It wouldn't make much sense for me, if those requests would then return the wrong SQL query without validating it at all
@wolfgangwalther I did not solve for this problem, as this is not super-worrying for my use-case.
> I'm assuming you're getting the query somewhere around here? It's already `stm` and `contentType` on the same line, so that should be good starting point for both, I guess.
This was helpful @wolfgangwalther . Thanks for the direction.
> One problem with this direction, the query structure might change later when we parametrize the values. The WHERE part could change to: `where amount >= $1 and bill is $2 and date >= $3 and state in ($4) and total_amount >= $5`. I'm guessing that would not be helpful for reusing the query in another client unless we somehow expose the parametrized values.
Yes, a parameterized query isn't going to be helpful. But now that I understand the area and how it works, I am good to go. I can always write my own stuff.
Also, since my issue is resolved, let me know if I should be closing this issue now. Thank you all /\
> @wolfgangwalther @steve-chavez yes, I have written a bunch of postgres function only - that helps first create the dynamic SQL and then execute it. I cannot use the `/rpc` option by PostgRest though as it requires POST requests.
That's not true. As long as you're not mutating the database, you can call it with `GET`: http://postgrest.org/en/latest/api.html#immutable-and-stable-functions
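For illustration, a minimal sketch of such a function (the names are made up): declaring it `STABLE` tells PostgREST it performs no writes, so it can be called with GET:

```sql
-- A read-only function; STABLE (or IMMUTABLE) makes it
-- callable via GET on the /rpc endpoint.
CREATE FUNCTION api.search_todos(done_filter boolean)
RETURNS SETOF api.todos
LANGUAGE sql STABLE AS $$
  SELECT * FROM api.todos WHERE done = done_filter;
$$;

-- Callable as: GET /rpc/search_todos?done_filter=false
```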
Hmm.. I still think that the use case could have been solved by dynamic SQL + a function (as Wolfgang mentions, it also works with GET).
That being said, besides the SQL query, some time ago I had a request to expose the metadata for how pgrst does the JOINs on resource embedding (see Embedding disambiguation); this currently only gets exposed when there's an error.
So for now I'm not sure if it's worth exposing the SQL or other information.
Also wanted to mention that `application/sql` is indeed an IANA media type: https://tools.ietf.org/html/rfc6922.
> Also wanted to mention that application/sql is indeed a IANA media type: https://tools.ietf.org/html/rfc6922.
Not a coincidence - I looked that up before suggesting ;)
> That being said, besides the SQL query, sometime ago I had the request to expose the metadata for how pgrst does the JOINs on resource embedding (see Embedding disambiguation), this currently only gets exposed when there's an error.
I can imagine this being very helpful. It feels like this could be in the OpenAPI output as well. Couldn't find much on that, only that such relations don't seem to be part of the OpenAPI spec currently: https://github.com/OAI/OpenAPI-Specification/issues/1646
If not in the OpenAPI, the `OPTIONS` request method comes immediately to mind... seems like quite a match.
> So for now I'm not sure if it's worth to expose the SQL or other information.
Same here.
Still, some thoughts, if we were to implement this. If we want to do this for modifying statements (INSERT/UPDATE/DELETE) and RPC as well - just returning the SQL with a `2xx` status code would imply that the request was successful and the changes were made. So maybe it should be a status code in the `4xx` range, which would indicate that the request was not actually executed. Not sure which status code exactly, though. 418 maybe :D.
Another approach could be to use the HTTP method `TRACE` for this. If we think of postgrest as just a proxy between the client and the postgres server, the proper response to a `TRACE` request, from postgres's side, should be the query (= the "SQL request"). We could probably pass the HTTP method to create the statement for (`GET`, `POST`, ...) via a custom header. Not sure about the request body. The query for `POST` and `PATCH` changes with a different request body, but the RFC explicitly states that a `TRACE` request should not have one...:
https://tools.ietf.org/html/rfc7231#section-4.3.8
Implementing `TRACE` in general with the `Max-Forwards` header, to return the HTTP request received by postgrest, could be helpful in a proxy setting as well.
Okay folks - as I am no longer blocked, I am closing this off from my side. Feel free to re-open this if you guys want to continue the discussion or if it's revisited in the future.
Thanks to all of you for quick responses and directions to the code-base. It was extremely helpful. :party_parrot: Adios.
I've noted that viiicky/postgrest only gives the main SQL query (`SELECT .. FROM .. WHERE`) and not the full wrapper SQL query (with the `json_agg` part). Still looks useful though.
Yeah, I just needed the main SQL query for my need, and I figured we won't need this feature upstream in the near future, so I kept it custom to my reqs.
Also, I have already released the docker image for this. In case future readers find it helpful: https://hub.docker.com/r/fylehq/postgrest
Environment
Feature request
We use the above-mentioned postgrest docker image with no problems.
Now, we need to write a new web service. We have decided not to use PostgREST in this new service because most of the SQL generated by this service will be of a complex nature, including lots of CTEs, group-bys, aggregate functions etc. We, however, wish to keep the API contract of this service similar to that of PostgREST.
We are thinking of reusing the "filters" part of the SQL generated by PostgREST somehow. That way, we can keep the contract for query params of APIs of this new service the same as PostgREST's. Thus, if PostgREST exposes an endpoint where I can make a request with these query params and get the SQL generated, I can build my complex SQL on top of this SQL without worrying about the "where" part of the SQL.
For example, below request to PostgREST:
```
GET /query/people?amount=gte.1000&bill=eq.true&date=gte.2020-06-30T18:30:00.000Z&state=in.(DONE)&everified=eq.true&limit=10&offset=0&or=(status.ilike.*ACTIVE*)&total_amount=gte.2000 HTTP/1.1
```

should return:

```sql
select * from people where amount >= 1000 and bill is true and date >= '2020-06-30T18:30:00.000Z' and state in ('DONE') and verified is true and (status ilike '%ACTIVE%') and total_amount >= 2000 offset 0 limit 10
```
I am going to go through the code and, based on my findings, will put my suggestions here on how to go about it. Right now I'm struggling to set up a Haskell dev environment - never worked with it.
Any directions on which part of the code I should be looking at would be helpful. Thanks.