Closed jasonmp85 closed 9 years ago
We discussed about modifying the code to make it closer to postgres_fdw, which means we break up quals into 'local' and 'remote'. The remote query would then pull columns for the local quals, and execute the remote quals entirely. This makes the logic cleaner, and sets it up for handling volatile/mutable functions. To begin with, we expect the local quals to be NIL as we push all of them down currently.
We also discussed using a mutator for modifying the FromExpr for the local quals, vs setting the quals to NULL. We decided on taking the simpler approach now, acknowledging that this may need to be revisited with subselect handling.
I haven’t written the ClassifyRestrictions
function yet, though at the moment it will trivially put all quals in the remote list and leave the local list empty. So all the functionality is there to use these variables (even if the local quals eventually have something to do), but that function is absent (as is documentation).
I didn’t want to change PlanSequentialScan’s parameter list as they’re analogous to the ordering of params in other plan*
functions in PostgreSQL.
Finally, I want to point out that our old code had weird requirements around what received the original query
and what received distributedQuery
, and in fact certain lists and objects build using distributedQuery
were later used with query
to build the sequential scan, which felt weird. I retained all those requirements, as they're actually necessary, but it bears calling out that distributedQuery
has had mutations performed on it by standard_planner
which means the final localQuery
is a bit of an amalgam of computed and raw things. Again, this is preserving existing behavior of pg_shard
but I wanted to mention it.
Responded to these pieces of feedback:
errhidestmt
query->jointree
is never NULL
BuildLocalQuery
PlanSequentialScan
modifies its query
argumentClassifyRestrictions
functionNeeding further feedback:
local
/remote
naming, as it's somewhat pervasivelog_distributed_statements
are you? I like it and find alternatives less clearcopyObject
question? I added comments clarifying that one function is returning a copy and another will scribble on input, but I don't see any other solution without adding an extraneous copy or adding too much responsibility to PlanSequentialScan
. What was your suggestion?ClassifyRestrictions
. It's naïve for now but I'll fill it in as part of #47WHERE
clause? Couldn't find clear precedent in Citus…(And as mentioned, I'll modify function declaration/definition order as a last step).
I think restriction
is OK for now. We already have a function called QueryRestrictList, and it has precedence in Postgres. I think we've used whereClause or selectClause in Citus previously.
Reworks multi-shard SELECT so instead of performing quals both remotely and on the local sequential scan, they are only performed remotely. This frees us up to avoid pulling columns mentioned only in quals over the wire, which provides users more control over how much data gets pulled in a multi-shard query.
fixes #33