Open Ngalstyan4 opened 1 year ago
I tried adding sqlfluff
to Dockerfile.dev
with
pip install sqlfluff
and to CMakeLists.txt
with
# Find all SQL files in your project
file(GLOB_RECURSE SQL_FILES ${CMAKE_SOURCE_DIR}/*.sql)
# Add custom target to lint SQL files
add_custom_target(
lint-sql
COMMAND sqlfluff lint ${SQL_FILES} --dialect postgres
COMMENT "Linting SQL files with SQLFluff"
)
# Add custom target to format SQL files
add_custom_target(
format-sql
COMMAND sqlfluff fix --force ${SQL_FILES} --dialect postgres
COMMENT "Formatting SQL files with SQLFluff"
)
I was able to run it with make format-sql
, but there's a lot of issues from our custom operator, such as:
L: 45 | P: 21 | PRS | Line 45, Position 21: Found unparsable section: '<->
| array[0,1,0] LIMIT 7'
I think pgFormatter
will be a better fit, since there is also pl/pgsql code.
I'm stopping work on this. Some comments for visibility.
The overall reason is that there isn't sufficient support for auto-formatting SQL + psql + pl/pgsql.
sqlfluff is one of the more customizable SQL formatters, but it doesn't for example support custom operators (e.g., <->), psql commands like \gset and \echo, and variable substitution like LIMIT :LIMIT.
pgFormatter worked fine with custom operators, but I think it was inadequate. It didn't for example handle CTEs well, and I didn't see a good way to fix that. Also, it didn't support "magic comments". They do support placeholders, which I tried to use for magic comments, but this caused formatting issues. (e.g., next line always started with a space)
pglast I liked because it was based on libpg_query, which is a port of Postgres's parser. I wasn't the biggest fan of all their formatting choices, but I thought it'd be easier to hack together support for the psql commands on top of it + magic comments. My attempt is included in the closed PR. In the end, it didn't handle \gset ideally, it didn't handle :'v4444' (but did handle limit :limit). I could spend some more time getting those cases to work, but then there's the additional pain of, sometimes it doesn't format \gset in the way you might like.
At that point, if you're constantly using magic comments to force it to format in an aesthetic way, what is the point?
I think if there's a nice formatter with psql support in the future, it would be great to use that.
Hi, I'm taking a fresh look at this issue. Two solutions seem possible: 1) use a third-party formatter/beautifier to standardize the SQL statements. But while there are dozens to consider, most fail to handle the PostgreSQL-specific syntax. 2) Use the PostgreSQL lexer/parser. This solution guarantees full comparability. If the query-string's syntax is correct, a parse tree is created and then the tree can be printed in standard format.
I suggest that choice (2) be implemented using a library (https://github.com/pganalyze/libpg_query).
If this approach is acceptable, I can write a testing tool.
This sounds interesting, @bobdevine !
Quick question:
Use the PostgreSQL lexer/parser. This solution guarantees full comparability
Does this guarantee compatibility with PSQL magic commands (\d
, \t+
, etc) or is it only about SQL?
Our formatter needs to handle PSQL magic commands since we use those in our tests.
It seems @dqii considered an approach similar to your proposal (2) via pglast (which is also based on libpg_query!) but that approach did not handle PSQL magic commands.
Here's a zip-file containing 2 C files to read a test file and then format all lines that are SQL. Formatting uses the PostgreSQL parser to validate the SQL command. Then the parsed SQL is "deparsed" to a canonical string.
This should format
lantern.sql
and regression test SQL files. Sqlfluff is one good option to use for this, but have not looked too much into it.