Open monacoremo opened 3 years ago
Putting another pending refactor here.
We had an issue where Content-Type
was being considered for GET - invalid since it has no body. This happened mostly because a typing deficiency(still present). More details at https://github.com/PostgREST/postgrest/pull/2168#discussion_r810233307.
unsafeHead
went away on https://github.com/PostgREST/postgrest/pull/2254
Request: purely parse request, at most taking config into account. `parseRequest :: Wai.Request -> Either RequestError Request
Plan: Use db structure to validate and plan the request, including embeds. plan :: DbStructure -> Request -> Either PlanError QueryPlan
To do the above, ApiRequest should remove its dependency on DbStructure - which is needed by the variadic logic introduced on https://github.com/PostgREST/postgrest/pull/1603.
After that findProc and others can be moved to Plan.hs
and ApiRequest
will be much cleaner.
I tried to do this on https://github.com/PostgREST/postgrest/pull/2496 but it turned out difficult bc I'm not familiar with the variadic logic. @wolfgangwalther Would be great if you could help with that.
Ideally the proc logic shouldn't be where the body processing is done. Like on MTUrlEncoded
here
That logic should happen after, on the Plan module.
The challenge is, that ApiRequest
does pass on URL parameters as iPayload
, i.e. payload is either the request body or a json produced from the params. ApiRequest
needs to split this up and have the params as just a [(Text, Text)]
list. Then, Plan
can do the list -> map later, taking the ProcDescription
into account.
I've been thinking about how we transform requests, and seeing the The Internals of PostgreSQL query processing chapter here:
1. Parser
The parser generates a parse tree from an SQL statement in plain text.
2. Analyzer/Analyser
The analyzer/analyser carries out a semantic analysis of a parse tree and generates a query tree.
3. Rewriter
The rewriter transforms a query tree using the rules stored in the rule system if such rules exist.
4. Planner
The planner generates the plan tree that can most effectively be executed from the query tree.
5. Executor
The executor executes the query via accessing the tables and indexes in the order that was created by the plan tree.
It seems:
ApiRequest
is the Parser
Analyzer
is really what we call Plan
now.Rewriter
concept could be used to further split our Plan
module, which is getting big. For example computed relationships
could be thought of as a "rewrite". https://github.com/PostgREST/postgrest/pull/2523 seems like a rewrite too.Planner
per se. This is a further optimization that perhaps could be used to reduce generated prepared statements, e.g. select=id,name
could generate the same query as select=name,id
. Not sure about this.
Executor
doesn't make sense for us.Edit: This won't work because we need to find the function on Haskell code first(with findProc).
To do the above, ApiRequest should remove its dependency on DbStructure - which is needed by the variadic logic introduced on https://github.com/PostgREST/postgrest/pull/1603. The challenge is, that ApiRequest does pass on URL parameters as iPayload, i.e. payload is either the request body or a json produced from the params. ApiRequest needs to split this up and have the params as just a [(Text, Text)] list.
The above continues to be a major challenge and I don't believe it can be done cleanly on Haskell code. It continues to be a blocker for https://github.com/PostgREST/postgrest/pull/1582 every time I tried it.
I believe the only feasible alternative is to parse the URL on SQL code. This would work for variadic and non-variadic payloads:
-- here "a" would be the variadic parameter
WITH
foo AS (
SELECT string_to_array(x, '=') AS key_val FROM string_to_table('a=4&b=3&a=5&c=&d=11&d=15', '&') x
),
foo1 AS (
SELECT key_val[1] as key, to_json(key_val[2]) as val FROM foo
union all
SELECT key_val[1] as key, json_agg(key_val[2]) as val from foo where key_val[1] = 'a' group by key
),
foo2 AS(
SELECT jsonb_object_agg(key, val) as res FROM foo1
)
SELECT * FROM jsonb_to_record((SELECT res FROM foo2 LIMIT 1)) AS (a int[], b int, c text, d int);
a | b | c | d
-------+---+---+----
{4,5} | 3 | | 15
Then we can remove the variadic logic from ApiRequest. Doing this on SQL may have other benefits later as well, it may be possible to customize parsing the input with a custom function.
A higher fidelity example:
SELECT pgrst_call.*
FROM (
WITH
pgrst_url_encoded AS (
SELECT string_to_array(x, '=') AS key_val FROM string_to_table('min=1&max=15', '&') x
),
pgrst_url_encoded_repeated AS (
SELECT key_val[1] as key, to_json(key_val[2]) as val FROM pgrst_url_encoded
union all
SELECT key_val[1] as key, json_agg(key_val[2]) as val from pgrst_url_encoded where key_val[1] = 'a' group by key
)
SELECT jsonb_object_agg(key, val) as val FROM pgrst_url_encoded_repeated
) pgrst_payload,
LATERAL (
SELECT * FROM jsonb_to_record(pgrst_payload.val) AS _("min" int, "max" int)
) pgrst_body,
LATERAL "test"."getitemrange"("min" := pgrst_body.min, "max" := pgrst_body.max) pgrst_call;
Continuing the above idea. Our App.hs
module can be equated to the TrafficCop
in PostgreSQL https://wiki.postgresql.org/wiki/Backend_flowchart#Main_Query_Flow
These ideas can be used to structure our modules and to improve the ARCHITECTURE.md.
There are still some pending refactors but overall the structure now is in line with the original comment. Will unpin this issue as it's not urgent anymore. We can now grow the codebase in a more maintainable manner.
Congratulations — great job!
- help with the onboarding of new contributors
As a (current) non-contributer I'm willing to help test this.
With #1793, we've started the process of streamlining the codebase. This is a purely internal concern that will yield no additional features. It will, however:
Next steps include (in no particular order and to be completed - please feel free to edit)
CLI
andConfig
modules (#1806)main/Main.hs
, minimizing code under CPP conditional compilation (#1814)plan :: DbStructure -> Request -> Either PlanError QueryPlan
reponse :: Plan -> QueryResult -> Wai.Response
Request
modules behind one API (#1831)Query
modules behind on API (#1831, #2488)Response
concerns fromApp
, i.a. (#1831, #2484)Error
type alongRequest
,Query
andResponse
GHC.Show
instances for string conversion (#2016)Protolude.Conv
,unsafeHead
etc.)Protolude.Conv.toS
(#2017)decodeUtf8
et. alunsafeHead
doctest
for unit testing within modules where needed (#2019)ARCHITECTURE.md
or similar to guide newcomers to the codebase, explaining how the major modules fit togetherAppState
as a monad (tbd)Other: