LauJensen / clojureql

ClojureQL is superior SQL integration for Clojure
https://clojureql.sabrecms.com
Eclipse Public License 1.0
285 stars 39 forks source link

Fix position of ORDER BY clauses #35

Closed rheimbuch closed 13 years ago

rheimbuch commented 13 years ago

Was just bit by this while trying to do a 'sort' on a relation like (difference r1 r2). Moved the insertion of order-by clauses to the end of the complete query expression. -ryan

LauJensen commented 13 years ago

Sorry about the long process time for this patch. When I apply it the sqllite integration tests start failing, did you observe this as well?

LauJensen commented 13 years ago

According to the documentation found here http://dev.mysql.com/doc/refman/5.1/en/select.html the ORDER BY clause must precede the LIMIT clause. Could you please provide me with an example of which SQL statement was causing you problems, and which statement fixed that problem?

Thanks

rheimbuch commented 13 years ago

Hi Lau,

Thanks for taking the time to look at this. :) I thought all the tests were passing, but I could very well be wrong. In fact, after more investigation my original patch is a bit naive and shouldn't be pulled in.

That said I still think this is a problem. You are correct that ORDER BY must precede LIMIT. The problem is specifically related to using ORDER BY with UNION (or EXCEPT or INTERSECT) as shown in the Mysql docs at http://dev.mysql.com/doc/refman/5.1/en/union.html.

I'm going to write up a set of tests to demonstrate the problem better, but here's the outline of my current thoughts:

Assuming (def t1 (table :t1)) (def t2 (table :t2))

Sort t1, then union

Given: (union (sort t1 [:t1.sortcol]) t2)

I expect: (SELECT t1.* FROM t1 ORDER BY t1.sortcol asc) UNION (SELECT t2.* FROM t2) But get: SELECT t1.* FROM t1 ORDER BY t1.sortcol asc UNION SELECT t2.* FROM t2 Problem: Without the parentheses, this is a syntax error in Mysql.

Sort t2, then union

Given: (union t1 (sort t2 [:t2.sortcol])) I expect: (SELECT t1.* FROM t1) UNION (SELECT t2.* FROM t2 ORDER BY t2.sortcol asc) But get: SELECT t1.* FROM t1 UNION SELECT t2.* FROM t2 ORDER BY t2.sortcol asc Problem: This is valid syntax, but the ORDER BY is applied to the complete "unioned" expression, not just the second SELECT. Note that without a LIMIT clause, the ORDER BY clauses are effectively discarded because the UNION operation doesn't preserve order. Therefore having the ORDER BY "float" out to affect the entire query is unexpected.

Sort t1 & t2, then union

Given: (union (sort t1 [:t1.sortcol]) (sort t2 [:t2.sortcol])) I expect: (SELECT t1.* FROM t1 ORDER BY t1.sortcol asc) UNION (SELECT t2.* FROM t2 ORDER BY t2.sortcol asc) But get: SELECT t1.* FROM t1 ORDER BY t1.sortcol asc UNION SELECT t2.* FROM t2 ORDER BY t2.sortcol asc Problem: Both problems from the previous two cases: syntax error & unexpected application of ORDER BY to the entire "unioned" set.

Union, then sort

This is the problem that I originally had...

Given: (sort (union t1 t2) [:t2.sortcol]) I expect: (SELECT t1.* FROM t1) UNION (SELECT t2.* FROM t2) ORDER BY t2.sortcol asc But get: SELECT t1.* FROM t1 ORDER BY t2.sortcol asc UNION SELECT t2.* FROM t2 Problem: A syntax error like the first case, but even if it were valid the ORDER BY would apply only to the first SELECT and not the whole expression.

After some more thought, I think there might be a related interaction with take/LIMIT and UNION, since ORDER BY must be applied before LIMIT. In the first 3 examples, the ORDER BY clauses wouldn't have any effect on the result because UNION doesn't preserve order. But if any of the first 3 examples included a "ORDER BY ... LIMIT", then there might be similar issue.

 (SELECT t1.* FROM t1) UNION (SELECT t2.* FROM t2 ORDER BY t2.sortcol asc LIMIT 50)

is a very different query from SELECT t1.* FROM t1 UNION SELECT t2.* FROM t2 ORDER BY t2.sortcol asc LIMIT 50

Phew... hope that made some kind of sense :)

Best wishes, Ryan

eslick commented 13 years ago

FYI - I encountered this today when trying to order the output of group-by aggregation. I swapped the order of the group-by and order-by clauses in the compile function which addressed my particular case.

LauJensen commented 13 years ago

rheimbuch: Fantastic write-up. Your clear distinction between what you expect and what you got has made it very easy for me to get an overview of the problem - Huge thanks! I'll look into this asap.

eslick: Your issue was related to using internal functions instead of the public API. Ive sent you an email with a few examples.

LauJensen commented 13 years ago

rheimbuch: I cant thank you enough for your fantastic write up. Your simple tests gave me the overview I required to fix this somewhat complex problem. You can take a look at this commit https://github.com/LauJensen/clojureql/commit/9bbf4cfcdd312150060f76d600cbdbdd1fcc6b08 and especially the tests to see if I have solved all issues. All the tests pass, so if they are sufficient, we're good to go. If not, please reopen this ticket.

Thanks again! Lau

rheimbuch commented 13 years ago

Hi Lau: This looks great! I'm not seeing anything obvious missing from the tests. Thanks so much for taking the time to look into this :)

Regards, Ryan