seancorfield / honeysql

Turn Clojure data structures into SQL
https://cljdoc.org/d/com.github.seancorfield/honeysql/CURRENT
1.78k stars 174 forks source link

The `:join` special syntax does not allow specifying an alias for the leftmost expression #529

Closed p-himik closed 6 months ago

p-himik commented 6 months ago

The original Slack thread: https://clojurians.slack.com/archives/C66EM8D5H/p1715631198472399

The desired query:

SELECT *
FROM foo
LEFT JOIN (populations pm INNER JOIN customers pc ON pm.id = pc.id AND pm.other_id = pc.other_id)
ON foo.fk_id = pm.id

My attempt at it:

(sql/format {:select    :*
             :from      :foo
             :left-join [[[:join [:populatons :pm]
                           {:join [[:customers :pc]
                                   [:and
                                    [:= :pm/id :pc/id]
                                    [:= :pm/other-id :pc/other-id]]]}]]
                         [:= :foo/fk-id :pm/id]]})
=>
["SELECT * FROM foo LEFT JOIN (POPULATONS(pm) INNER JOIN customers AS pc ON (pm.id = pc.id) AND (pm.other_id = pc.other_id)) ON foo.fk_id = pm.id"]

The special syntax required for that works just fine, except for the populations pm part. If you were to use [:populations :pm] it'd be formatted as an expression and thus would become populations(pm).

There's a special :alias syntax that sounds suitable, but it doesn't work either as it expects only a single argument.

seancorfield commented 6 months ago

The special :join syntax was introduced in #483 so I'll have to see why I implemented it as format-expr in the first place, since that precludes using an alias without forcing additional nesting to get the expression semantics.

seancorfield commented 6 months ago

Looking at the syntax diagrams in a few database's docs, it looks like format-selectable-dsl would be better here than format-expr and there are no expression examples given, nor tests in HoneySQL, so changing this shouldn't break anyone's sanctioned code... And clearly the expectation is that [:populations :pm] should work (which a selectable DSL formatter would do), and therefore an expression would need an additional level of [ .. ] just as in from and select and so on.

p-himik commented 6 months ago

I think that that syntax is largely to avoid regrouping items. Maybe @NoahTheDuke can correct me, but after skimming the docs it seems that an alternative way to write the very same query would be:

SELECT *
FROM populations pm
INNER JOIN customers pc ON pm.id = pc.id AND pm.other_id = pc.other_id
RIGHT JOIN foo ON foo.fk_id = pm.id

With that in mind, maybe the whole special-case :join is unnecessary if (...)-grouping can be achieved at the e.g. :join-by level.

A somewhat related perspective pointed out by those docs mentioned in #483:

A JOIN clause combines two FROM items

So, at least conceptually and for PostgreSQL, :join seems to logically belong to :from. Of course, that ship has long sailed, but maybe extending :from is also an option.