antlr / grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.

[postgresql] Fix for #4309, #4315, #4318. #4319

Closed kaby76 closed 1 week ago

kaby76 commented 1 week ago

This is a fix for #4309, #4315, and #4318.

This change corrects a number of issues with the PostgreSQL grammar. The underlying problem was that several pl/pgsql grammar rules remained in this grammar even after https://github.com/antlr/grammars-v4/pull/4316. As I mentioned, adding the pl/pgsql grammar directly into the PostgreSQL grammar causes a lot of problems. It is very important to learn from this: you should not merge the pl/pgsql grammar into the PostgreSQL grammar unless you are very, very careful. For example, you would have to have every lexer work in a different lexer mode by default. The official PostgreSQL parser does not try to combine the grammars. Why should the ANTLR grammar deviate from this?

The changes bring the grammar back in line with the official PostgreSQL grammar, gram.y. All the tests pass but, more importantly, the ambiguities that I mentioned are now gone. That is expected: the official Bison grammar is LALR(1), so it shouldn't be ambiguous.

It's not clear whether I have removed all of the pl/pgsql grammar from the postgresql grammar. One test uses embedded SQL identifiers, although it shouldn't, so PLSQLVARIABLENAME is still defined and used.

I noticed that the lexer grammar contained numerous "built-in" function names. These were added erroneously. I added the comments from gram.y for the identifier classes into the parser grammar.
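
For readers who have not looked at that part of gram.y, here is a rough Python model of the identifier classes. It is only a sketch: the class and category names follow gram.y, but `SAMPLE_KEYWORDS`, `IDENTIFIER_CLASSES`, and `accepts` are names made up for this illustration, the keyword table holds one sample keyword per category rather than the real list, and newer classes such as BareColLabel are left out.

```python
from enum import Enum


class KeywordCategory(Enum):
    """The four keyword categories that gram.y distinguishes."""
    UNRESERVED = "unreserved_keyword"
    COL_NAME = "col_name_keyword"
    TYPE_FUNC_NAME = "type_func_name_keyword"
    RESERVED = "reserved_keyword"


# Tiny illustrative sample, one keyword per category (NOT the full list).
SAMPLE_KEYWORDS = {
    "abort": KeywordCategory.UNRESERVED,
    "between": KeywordCategory.COL_NAME,
    "join": KeywordCategory.TYPE_FUNC_NAME,
    "select": KeywordCategory.RESERVED,
}

# Keyword categories each identifier class accepts, in addition to a plain IDENT.
IDENTIFIER_CLASSES = {
    # Names that can be column, table, etc. names.
    "ColId": {KeywordCategory.UNRESERVED, KeywordCategory.COL_NAME},
    # Names that can be type or function names.
    "type_function_name": {
        KeywordCategory.UNRESERVED,
        KeywordCategory.TYPE_FUNC_NAME,
    },
    # Any not-fully-reserved word.
    "NonReservedWord": {
        KeywordCategory.UNRESERVED,
        KeywordCategory.COL_NAME,
        KeywordCategory.TYPE_FUNC_NAME,
    },
    # Labels allowed in AS clauses: every keyword category.
    "ColLabel": set(KeywordCategory),
}


def accepts(identifier_class: str, word: str) -> bool:
    """Return True if `word` may appear where `identifier_class` is expected.

    Simplification: any word missing from SAMPLE_KEYWORDS is treated as a
    plain IDENT, which every identifier class accepts.
    """
    category = SAMPLE_KEYWORDS.get(word.lower())
    if category is None:
        return True
    return category in IDENTIFIER_CLASSES[identifier_class]


if __name__ == "__main__":
    for cls in IDENTIFIER_CLASSES:
        allowed = [w for w in SAMPLE_KEYWORDS if accepts(cls, w)]
        print(f"{cls}: {allowed}")
```

In this model a word that the lexer does not turn into a keyword token is an ordinary IDENT and fits every class, whereas a word with its own token only fits the classes that list its category.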

kaby76 commented 1 week ago

(Need to resolve conflicts once https://github.com/antlr/grammars-v4/pull/4316 is merged.)

teverett commented 1 week ago

@kaby76 thanks!