heavyai / heavyai-jdbc

A JDBC driver for connecting to an HeavyAI GPU database and running queries.
https://www.heavy.ai/
Other
9 stars 16 forks source link

{fn timestampadd} gets translated to {fn dateadd} #2

Open mlazatinph opened 5 years ago

mlazatinph commented 5 years ago

CalciteAdapter.cpp replaces timestampadd with dateadd.

However, in cases where timestampadd(...) is used as a jdbc espace function, as in "{fn timestampadd(...)}", CalciteAdapter should also remove the enclosing "{fn" and "}".

Otherwise, we are getting an error saying that "{fn DATEADD}" is not supported.

cdessanti commented 5 years ago

I already did such rewrites to get rid of fn and ts jdbc driver level, so I could do a fast pull request to manage this

mlazatinph commented 5 years ago

I've actually modified fnReplace(String sql) in OmniSciStatement.java as well to handle this, but my solution is based purely on regular expressions. It is good enough for my current use case, but I know it can break. It is susceptible to errors, e.g. when the SQL is written in a way that may break the regex pattern. Most of the time though, it is not the case, even when {fn timestampadd} is nested. Still, I am not confident to share my code because I know it can break.

The only proper solution I can think of is writing a full-blown SQL parser that breaks the SQL into tokens, goes thru the nested structure of the SQL, then remove the tokens {fn and }. We also have to take into account that these tokens can be written inside string literals, in which case, they should not be removed.

Having said that, this gives me the idea that the proper-proper solution would be to remove the apply_shim in CalciteAdapter.cpp, do not re-write fnReplace(String sql) in OmniSciStatement.java, and instead handle this in the Query parsing engine (I have not gone that deep yet).

cdessanti commented 5 years ago

I did the same way to make omniscidb work with JDBC generic data sources in Tableau. As you said is a little dangerous, but you can do other rewrites to improve the performances on time-based queries. Performance wise is better to have a single date_trunc than a complex expression to get a date at the start of the month or at the start of the week. We could add a parameter in the jdbc driver that disable those new rewrites. I already wrote the code, but to make it works some constructors of classes have to be changed