Closed dey4ss closed 3 months ago
What is your take on the “no column references” limitation?
First of all, it is in line withe the standard and pgSQL, which we try to mostly comply to. Second, it also makes sense IMO. foo JOIN bar USING (...)
is comparable to foo NATURAL JOIN bar
(the latter is equivalent to the first when the USING
list contains all columns with the same name in foo
and bar
– USING
is a NATURAL
join where you explicitly specify or limit the used columns). All columns in the list have to be present in both foo
and bar
, so a column reference explicitly referencing a column on only one side is not really useful.
Instead of parsing the list of columns as conjunctions of equality predicates and adding an extra boolean flag to JoinDefinition
, we could also simply have the vector of column names as member, where either this vector or the join condition would be set:
struct JoinDefinition {
JoinDefinition();
virtual ~JoinDefinition();
TableRef* left;
TableRef* right;
Expr* condition; // set for `foo JOIN bar ON foo.x = bar.x`
std::vector<char*>* namedColumns; // set for `foo JOIN bar USING (x)`
// both unset for `foo NATURAL JOIN bar`
JoinType type;
};
This would be a bit more explicit and leaves resolving the column names to a predicate to the DBMS. However, it differs from the way we handled USING
before, and I am not sure what we want to do here in terms of backwards compatibility/divergence from forks.
Feel free to zünd.
This PR allows a list of identifiers (a.k.a column names) when joins name columns with
USING (...)
. This behavior is in line with the SQL standard and PostgreSQL. Furthermore, we add a flag if join conditions come from specifying them directly (foo JOIN bar ON (...)
) or from named columns (foo JOIN bar USING (...)
) so the database can adjust the emitted columns addording to the SQL standard (emit both columns for each predicate inON
, emit a single column for each column inUSING
).For details, see #239.
Fixes #239.