VALUES statement in MySQL

irigarae commented 2 months ago

This :values returns invalid mysql syntax (taking inspiration from special syntax :composite example:

(honey.sql/format
 {:select [:*]
  :from [[:table :t1]
         [{:values [[1 2] [3 4]]}
          [:t2 [:composite :col1 :col2]]]]}
 {:dialect :mysql
  :inline true})

it returns

SELECT * FROM (VALUES (1, 2), (3, 4)) AS `t2` (`col1`, `col2`)

however per the official documentation, https://dev.mysql.com/doc/refman/8.0/en/values.html, the correct syntax is

SELECT * FROM (VALUES ROW(1, 2), ROW(3, 4)) AS `t2` (`col1`, `col2`)

p-himik commented 2 months ago

:values itself should probably continue working the original way since MySQL also allows plain VALUES clause that works differently.

Unfortunately, handling of :values doesn't seem to allow you to throw in some extra keywords to make it be formatted as VALUES ROW(...) (only VALUES (ROW(...)) which is invalid), but you can create your own formatter quite trivially.

seancorfield commented 2 months ago

I think I'll need to add :values-row to support this. I'll give it some thought before I choose an approach.

irigarae commented 2 months ago

VALUES is a DML statement introduced in MySQL 8.0.19 which returns a set of one or more rows as a table. In other words, it is a table value constructor which also functions as a standalone SQL statement.

Looking at the official documentation it's used to construct a table, and it contains rows inside. Maybe instead of :values-row (in singular) something like :values-table or :values-rows (notice the plural) might be more appropriate?

And also don't know if there are existing formats like this, but instead of something like {:values-table [[1 2] [3 4]]} maybe :row could be explicit like {:values [[:row 1 2] [:row 3 4]]}? Just leaving ideas, I have no real preference whatsoever.

p-himik commented 2 months ago

Can't use {:values [[:row 1 2] [:row 3 4]]} because right now it's a valid syntax and might be used by other people for other DBs.

seancorfield commented 2 months ago

Re: the name in the DSL, it is standard for HoneySQL to produce SQL keywords that match the DSL keyword (or symbol) so VALUES ROW(1, 2) would come from :values-row (and if there are multiple rows, each of them is still ROW singular).

seancorfield commented 2 months ago

OK, I did some experiments and this seems viable:

(format {:values [:row [1 2] [3 4]]})

with just a couple of tweaks to format-values so that's what I'm going to go with.

I just need to figure out an appropriate set of tests and some documentation :)

seancorfield commented 2 months ago

@irigarae You can try this out with the latest SNAPSHOT or git dep. I'll add docs and tests tomorrow probably and may release a full new version next week...

seancorfield commented 2 months ago

Example added to docs (in Clause Reference) which also acts as a test.

irigarae commented 2 months ago

@irigarae You can try this out with the latest SNAPSHOT or git dep. I'll add docs and tests tomorrow probably and may release a full new version next week...

Don't know if it's intentional, but the test is on #{:row 'rows} (see commit). Don't know if it's intentional to have 'rows or it should've been #{:row 'row}.

$ clj -Sdeps '{:deps {com.github.seancorfield/honeysql {:git/sha "3d48ecac379d6e85535cc94137d4a7115873f1fc"}}}'
user=> (require '[honey.sql])
user=> (honey.sql/format {:select [:*] :from [[{:values [:row [1 2] [3 4]]} :t]]} {:dialect :mysql :inline true})
["SELECT * FROM (VALUES ROW(1, 2), ROW(3, 4)) AS `t`"]
;; this one uses singular `row`
user=> (honey.sql/format '{select [*] from [[{values [row [1 2] [3 4]]} t]]} {:dialect :mysql :inline true})
["SELECT * FROM (VALUES ROW, (1, 2), (3, 4)) AS `t`"]
;; this one uses plural `rows`
user=> (honey.sql/format '{select [*] from [[{values [rows [1 2] [3 4]]} t]]} {:dialect :mysql :inline true})
["SELECT * FROM (VALUES ROW(1, 2), ROW(3, 4)) AS `t`"]

Thanks for the prompt development!

(Off topic) Unrelated, but I’m curious about why you test both the keyword or the symbol. I would’ve expected an initial pass normalizing everything (e.g. all keywords become symbols) and then only test on a single thing. Is there a reason for this design? Maybe performance?

seancorfield commented 2 months ago

Thanks @irigarae -- that's a bug. Fixed now.

The reason is so you can mix'n'match quoted symbolic forms and keyword-based forms, especially when using the helper functions, so you can write whichever is easier for your situation -- or your preference. A quoted symbolic form is less typing and can be easier to read -- it looks more like code -- if you don't have "variables" (values) in it, or you're using named parameters.

irigarae commented 2 months ago

Thanks for everything!

Anyways, I'm not sure I expressed my question clearly. Of course I think it's really useful to mix and match, everybody can use their own preferred style. But what I imagine is that on the parsing/interpretation level it's generally safe to blindly normalize to one thing (e.g., keywords to symbols) and then build all the logic on a single way. E.g., basically the test in this case would be just #{'row}. I wanted to understand why such choice of not normalizing before all the parsing logic. Maybe your response explains that and it's just me that I'm not familiar with the internals. No need to delve deeper!

Just playing with it I found this difference (I can open a bug report if you prefer), which would be avoided if things were normalized somehow:

(sql/format '{select * from t1 join (t2 (:using id)) where (= t1/id 1)} {:dialect :mysql})
;; => ["SELECT * FROM `t1` INNER JOIN `t2` USING (`id`) WHERE `t1`.`id` = ?" 1]
(sql/format '{select * from t1 join (t2 (using id)) where (= t1/id 1)} {:dialect :mysql})
;; => ["SELECT * FROM `t1` INNER JOIN `t2` ON USING(`id`) WHERE `t1`.`id` = ?" 1]

Ok to leave the conversation here, in any case.

seancorfield commented 2 months ago

Efficiency. Normalizing everything first would require rebuilding the whole DSL with (say) just keywords and then parsing the whole DSL. In many places, HoneySQL does convert symbols to keywords as part of parsing, but in some places it's just easier to test against the keyword or symbol without doing the conversion.

And, yes, what you ran into is a bug, so please create an issue for that!

seancorfield commented 2 months ago

Don't worry about creating an issue -- I found and fixed it. There were a couple of other, similar bugs in other places (also fixed).

seancorfield / honeysql

VALUES statement in MySQL #544