Closed morvael closed 3 years ago
No doubt system
became reserved word in 4.2, but I think the list of reserved words varies between databases and having one big set of reserved words from any and all databases is leading to such problems. Too many words are reserved because of that.
Sadly our database (Informix) understands only
b1.'system'
b1."system"
does not work on informix?
Anyway, system
is likely a reserved keyword because of Oracle's ALTER SYSTEM ...
statement.
Ok, b1."system" works, so this is a workaround. But as I wrote, there should be a notion of dialects with different sets of reserved words, instead of one big set covering all databases. Because that is very restrictive. And with every version of JSqlParser that adds new reserved keywords all queries passing through the parser must be examined if they don't contain words that were OK, but suddenly aren't OK anymore...
Looking at your fix, I had column named system, not table.
SELECT b1.system FROM table b1
No a fan of workaround, as columName in parsed statement retains those quote chars, so some other code using that must be aware of that possibility and add conditional stripping...
Looking at your fix, I had column named system, not table.
Well, that is an example why verbose samples and test cases are important. This works now:
SELECT system from b1.system
Haven't seen such query yet, but ok :)
Haven't seen such query yet, but ok :)
Have not seen system
used as a column name or alias without proper quoting either -- so should I reconsider :-D
Btw, there is a comprehensive list of SQL 99 Keywords. You can find it on the internet. I can only advise to avoid any of these keywords and to apply proper quoting to all table- and column names regardless. Furthermore it is good practice to write RDBMS agnostic, standard compliant queries -- unless the RDBMS itself forces you to use a certain non-compliant syntax (yes, I mean you, Oracle!)
Yes, but it's legacy system still in use, and that field exists for longer than I work with it... Will not change it just to remove potential collision :)
And one more more thing: I believe most of the contributors know Oracle or MS SQL Server, but few use Informix.
So if you have a catalog of your own preferred Informix queries and would like to contribute it, we could actually integrate it into our unit tests and assure that nothing breaks during the development.
Have you considered handling dialects (different sets of reserved words depending on selected database engine)? I know it may be a bit problematic because of static construction of the parser definition...
As I thought, SYSTEM is not reserved in the core SQL-92 standard.
As I thought, SYSTEM is not reserved in the core SQL-92 standard.
I wrote SQL99 for a reason. https://www.drupal.org/docs/develop/coding-standards/list-of-sql-reserved-words
Yes, there were a few discussions around, but no really good solution emerged so far. More importantly, in your case even a perfect solution would not have helped much because (to my best knowledge) nobody uses or knows Informix -- so who would write the dialect specific grammar?
Well, so far we have only encountered problems with EXECUTE FUNCTION
and EXECUTE PROCEDURE
syntax, and fully qualified name syntax database@server:owner.table
( https://www.ibm.com/docs/en/informix-servers/14.10?topic=database-specifying-object-in-cross-server-query ), where owner is optional. There are also some simple unrecognized function names, like CURRENT (if I'm not mistaken). We try to work around these problems, but "native" support would be great.
A good idea would be to somehow make the parser more dynamic, where I could register/deregister keywords at will (or map them to be interpreted as some other keywords), including support for "raw sql part" which the parser doesn't try to fit into object model, but just stores it raw and appends when needed. That way I could quickly configure our instance of parser to do what I need it to do, without you having any Informix experts on your side.
Unfortunately, this is not how the JavaCC Parser generator works. As keyword is not a keyword because we wanted to make life difficult. As keyword is actually a token which tells the parser generator to start a certain sequence.
Oracle's 'ALTER SYSTEM' will enforce 'SYSTEM' as keyword, unless we created a different Grammar without the 'ALTER SYSTEM' production.
In my opinion, we would need:
1) a Template Engine and a Grammar Template, which will generate the Java CC Grammar for a certain RDBMS (based on paramaters) 2) then build the Parser dynamically and "on the fly"
This comes at a high cost though: a) we would need to deploy the template engine and the JavaCC library b) the performance would suffer, at least regarding the initiation time (building the template and the Parser) c) it would need the Java Compiler and run only with a JDK (but not with a JRE)
Big effort for small benefits.
Yes, wrote my share of Antlr4 grammars, so I know how it rolls, and how static (while yet dynamic) it is. So the only other option is to pester you forever with small(ish) issues on github, to add them manually each time there's a problem :)
Too bad it isn't possible for SYSTEM to be keyword only after ALTER, not in some other context (I'd know how to solve this in Antlr4).
Too bad it isn't possible for SYSTEM to be keyword only after ALTER, not in some other context (I'd know how to solve this in Antlr4).
That's exactly the point: JavaCC works with Tokens
and ALTER SYSTEM
is actually a sequence of 2 tokens.
When we define now Tablename
, we need to define all allowed Tokens and since SYSTEM
is a Token on its own already, we need to allow it explicitly.
I would not have thought, that anybody would use SYSTEM
as a qualified name anywhere and for that reason I have not enabled it as a qualified name, when implementing ALTER SYSTEM
.
Until someone reports, that indeed SYSTEM
is used as a column or table name.
Then we add it and try, if everything else still works. If so, then good. If something else breaks, we turn it down.
Too bad it isn't possible for SYSTEM to be keyword only after ALTER, not in some other context (I'd know how to solve this in Antlr4).
On ANTLR4, I believe it is more flexible regarding the grammar, but the generated parser is less performant. (Although I have no proof and might be simply wrong on that assumption.)
Yeah, you can try to move some parts from lexer to parser level in antlr4, but following good practice of defining every token for lexer you end up with the same problem, where you have to add them manually to various parser rules. Maybe just include keywords as allowed table/column names by default, if they are not in the basic SQL-92 standard, but come from some extension or one specific database.
Sort of making it backwards compatible by default - adding new keyword? Fine, just add it as allowed table/column name. Because someone might have been using it like that, when it wasn't keyword yet.
This worked in 4.1, but doesn't work in 4.2:
Parser works for
Sadly our database (Informix) understands only
which in turn isn't something the parser understands.