Adding support for delimited identifiers

LauJensen / clojureql

ClojureQL is superior SQL integration for Clojure

https://clojureql.sabrecms.com

Eclipse Public License 1.0

284 stars 39 forks source link

Adding support for delimited identifiers #51

Open budu opened 13 years ago

budu commented 13 years ago

While working on the original ClojureQL, I came across the dreaded delimited identifiers concept of SQL. With the new CQL, we currently can't access any SQL object that require delimited identifiers. After looking at how to handle that feature with the new code I'd like to open a discussion about it before trying to solve this issue.

Here's a explanation of the issue between different implementations: http://bit.ly/eHdIQP. Overall, there's only two DBMS which are causing problems, MySQL (which use backquotes) and SQL Server (which use brackets). Both have settings to use double-quotes.

The simplest solution I can think of for now would be to make the default compile method to use double-quote and to only set standard like behavior in custom compile methods for the faulty implementations which then would call the default one. We could add an option to db-spec to specify if we want to use delimited identifiers.

Any suggestions?

r0man commented 13 years ago

Looking at the activerecord implementation (the connection adapters) I think it's a wise idea to split this functionality into different compilers for each database vendor. The various "quote_*" methods handle many edge cases ...

MySQL

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/mysql_adapter.rb#L229

PostgreSQL

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L319

budu commented 13 years ago

I've find out an even simpler way to handle this problem. JDBC's DatabaseMetaData class have a method called getIdentifierQuoteString that returns the character used or " " if delimited identifiers are not supported.

budu commented 13 years ago

Started working on it (https://github.com/budu/clojureql/tree/quoted-identifiers) but I now realized this is much more work than I thought at first. While working on it I felt the strong need to restructure the compiler code to make it more modular because in its current state it is very difficult to work with. I'll push that other branch later once I get something working. Hopefully that will help me understand the code better.

budu commented 13 years ago

After discussing with Lau about this issue, we think it's better to add this feature before reworking the compiler. So we'll concentrate on getting the to-* functions to support delimited identifiers directly. One other important point is how would we make that feature optional and which is the right default behavior.

I can think of to ways of controlling that option. Adding a new option to the db-spec map, this would require to pass that option to the to-* internal functions. Else we could add a global var to the internal namespace which could be set by a core function. We could also provide a combination of the two.

I think the most common use of SQL is with legal identifiers, thus we would make this feature disabled by default.

LauJensen commented 13 years ago

Just before a statement is executed we have a vector where the first arg is essentially SQL keywords #{SELECT WHERE FROM DISTINCT LIMIT OFFSET} etc, and everything else is either a ? or a column spec. I think its possible to apply the delimiters at this stage this totally avoiding the to* internal functions.

budu commented 13 years ago

That's a good idea, which I've leveraged to include check constraint support in Lobos using ClojureQL predicates by extracting the keyword list beforehand (see https://github.com/budu/lobos/blob/master/src/lobos/compiler.clj#L144). There's still the function calls to take into account. Overall this technique feels brittle but I'll consider it before doing further work on this.

budu commented 13 years ago

I've gave up for the third time today. This improvement end up being much more work than I ever though, even with the last implementation idea. This kind of modification would really benefit from having an AST and not relying on contrib for insert/update, anything else feels really too hackish to my taste.

daaku commented 12 years ago

This would be pretty awesome, because it would allow hyphens as well as asterisk in the names allowing for more clojuresque names like user-images or user-id.