google / zetasql

ZetaSQL - Analyzer Framework for SQL
Apache License 2.0

Formatting issue with table names #128

Open kollerdaniel opened 1 year ago

kollerdaniel commented 1 year ago

Hi,

I use ZetaSQL for syntax checking, format checking, and file formatting. While using the tool I ran into an issue: if I use a table name like this:

table-name-example

the formatter turns it into:

table - name - example

My formatter configuration file looks like this:

allowInvalidTokens: false
capitalizeKeywords: false
enforceSingleQuotes: false
indentationSpaces: 4
expandFormatRanges: false
lineLengthLimit: 100
preserveLineBreaks: false

I read the "Lexical structure and syntax" documentation, and it says that unquoted identifiers may contain single dashes, so I do not understand why this happens. Is there a fix for this, or a configuration option that avoids it?

a-litvinov commented 1 year ago

Hi Daniel,

Thanks for reporting the issue! Unfortunately, this is a known limitation of the formatter. The formatter doesn't use the ZetaSQL parser to understand the meaning of each token; it relies on a very simplified set of heuristics that is quite lenient and can format input that wouldn't even parse (a common situation in the SQL world; think of macros, for example). To distinguish a dashed identifier from a math expression, the formatter would need to know every location in the grammar where a table name is expected.
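
To illustrate the ambiguity described above, here is a minimal Python sketch of a token-only formatter. It is a toy, not ZetaSQL's actual tokenizer or heuristics; `tokenize` and `naive_format` are hypothetical names introduced only for this example. It shows that, without grammar context, a dashed table name and a subtraction look identical at the token level.

```python
import re

# Toy lexer for illustration only; this is NOT ZetaSQL's tokenizer.
# It recognizes words, integers, and single-character symbols.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\w\s]")


def tokenize(sql):
    """Split SQL text into a flat list of token strings."""
    return TOKEN_RE.findall(sql)


def naive_format(sql):
    """Hypothetical token-only formatter: one space between all tokens."""
    return " ".join(tokenize(sql))


# Without grammar context, a dashed table name and a subtraction
# produce the same token shape: word, '-', word, '-', word.
print(tokenize("table-name-example"))  # ['table', '-', 'name', '-', 'example']
print(tokenize("price-tax-discount"))  # ['price', '-', 'tax', '-', 'discount']

print(naive_format("SELECT * FROM table-name-example"))
# SELECT * FROM table - name - example
```

A formatter working from this token stream alone has no way to tell that the first case follows `FROM` in a position where a table name is expected, which is the information a full parse would provide.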

The only workaround for now is to escape table names that contain dashes with backticks (`), even though ZetaSQL itself allows using such names unescaped.
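
For anyone generating SQL before running it through the formatter, the workaround could be applied with a small helper like the Python sketch below. `quote_table_name` is a hypothetical function, not part of ZetaSQL. It assumes a single name with no backticks or backslashes in it, and it does not handle reserved keywords.

```python
import re

# Names matching this pattern are left unquoted in this sketch.
SIMPLE_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def quote_table_name(name):
    """Wrap a table name in backticks unless it is a plain identifier.

    Hypothetical helper: it handles a single name only and rejects
    backticks and backslashes instead of escaping them.
    """
    if not name:
        raise ValueError("table name must not be empty")
    if "`" in name or "\\" in name:
        raise ValueError("escaping is out of scope for this sketch")
    if SIMPLE_IDENTIFIER.fullmatch(name):
        return name
    return "`" + name + "`"


query = "SELECT * FROM " + quote_table_name("table-name-example")
print(query)  # SELECT * FROM `table-name-example`
```

With the name quoted this way, the formatter has no bare dashes to mistake for minus operators.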

kollerdaniel commented 1 year ago

@a-litvinov thanks for your reply!