Open pnezis opened 1 week ago
I think I have a temporary fix for this using regex replaces. Not an ideal long term solution but fixes this in the short term until the Rust library is fixed. Should be releasing that later today!
I'm going to keep playing around with this for a bit, but from some initial tests it seems to fix the issues introduced by the Rust library: https://github.com/akoutmos/sql_fmt/commit/23ce12ec2c137f7d53e6727cc251d3e758d0f447
I don't think it's that simple; this would introduce other issues. For example, the following input:
SELECT 2 * -3
would be converted to
SELECT 2 *-3
This may be valid SQL, but for other operators it will lead to invalid queries. The only guaranteed way to solve this is at the tokenisation level.
Another example that makes this more clear:
SELECT 2 - - 3;
which returns 5
would be formatted to
SELECT 2 -- 3;
which evaluates to 2, because -- starts a line comment and the rest of the line is ignored.
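The two results above can be checked against a real engine. This is a minimal sketch, assuming SQLite semantics (through Python's standard `sqlite3` module) and a hypothetical `naive_collapse` helper that stands in for a whitespace-collapsing regex replace. It is not the regex from the linked commit, and the operator character set is an assumption made for illustration.

```python
import re
import sqlite3

# Hypothetical stand-in for a whitespace-collapsing regex fix.
# This is NOT the regex used in the linked sql_fmt commit.
OPERATOR_CHARS = r"[-+*/<>=@|#&!~]"
_COLLAPSE = re.compile(rf"({OPERATOR_CHARS})\s+(?={OPERATOR_CHARS})")


def naive_collapse(sql: str) -> str:
    """Remove whitespace between two adjacent operator characters."""
    return _COLLAPSE.sub(r"\1", sql)


def evaluate(sql: str):
    """Run a single-value query in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(sql).fetchone()[0]
    finally:
        conn.close()


original = "SELECT 2 - - 3;"
rewritten = naive_collapse(original)

print(rewritten)            # the two minus signs are now adjacent
print(evaluate(original))   # binary minus applied to a negated 3
print(evaluate(rewritten))  # everything after "--" is treated as a comment
```

The regex has no way to know that the two minus signs are separate tokens, which is why the query still parses but silently returns a different value.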
Good points. Perhaps another route needs to be explored if these issues cannot be sorted out in the sqlformat-rs package.
There is the https://github.com/apache/datafusion-sqlparser-rs project, which offers a lexer and parser for various SQL dialects. But no formatter is provided, so I (or we, if you are interested lol) would have to build one on top of the AST that the library generates for the provided SQL statements. The library provides a Visitor trait (https://docs.rs/sqlparser/latest/sqlparser/ast/trait.Visitor.html) for the various AST nodes, so a formatter should be possible. The added benefit of using an actual lexer+parser is that it will generate error messages for invalid SQL, whereas right now sqlformat-rs is happy to process whatever you give it.
Implementing a formatter given an AST is doable, but it will need a lot of work. On the other hand, fixing the tokenisation of operators in the sqlformat-rs package is much easier.
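To illustrate what handling operators at the tokenisation level could look like, here is a minimal sketch. It does not reflect how sqlformat-rs is implemented; the operator list, the `tokenize` and `format_tokens` names, and the token classes are all assumptions chosen for the example. The idea is that multi-character operators are matched as a single token (longest first), while operators separated by whitespace in the input stay separate tokens.

```python
import re

# Hypothetical operator list for illustration; a real dialect defines many more.
MULTI_CHAR_OPERATORS = ["->>", "->", "<=", ">=", "<>", "!=", "||"]

# Longest operators first so that "->>" is not split into "->" and ">".
_OPS = "|".join(
    re.escape(op) for op in sorted(MULTI_CHAR_OPERATORS, key=len, reverse=True)
)

TOKEN_RE = re.compile(
    "|".join(
        [
            r"(?P<comment>--[^\n]*)",              # line comment
            r"(?P<string>'(?:[^']|'')*')",         # single-quoted string
            r"(?P<number>\d+(?:\.\d+)?)",          # integer or decimal
            r"(?P<word>[A-Za-z_][A-Za-z_0-9]*)",   # keyword or identifier
            "(?P<op>" + _OPS + r"|[-+*/<>=(),;.])",  # operators and punctuation
            r"(?P<space>\s+)",                     # whitespace (dropped)
        ]
    )
)


def tokenize(sql):
    """Split SQL text into tokens, dropping whitespace."""
    pos = 0
    tokens = []
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"unexpected character {sql[pos]!r} at offset {pos}")
        if match.lastgroup != "space":
            tokens.append(match.group())
        pos = match.end()
    return tokens


def format_tokens(sql):
    """Toy formatter: re-join tokens with single spaces, never merging them."""
    return " ".join(tokenize(sql))


print(tokenize("SELECT 2 - - 3"))
print(format_tokens("SELECT data->>'name' FROM t"))
```

Because the formatter only ever joins whole tokens, it cannot turn two separate minus signs into a comment marker or break a multi-character operator apart.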
My only concern currently is whether sqlformat-rs is production ready and whether any other such bugs exist. In our case the test suite broke, but a formatter that modifies SQL queries may lead to subtle bugs that will be a nightmare to detect.
This is a bug in the Rust crate.
I would suggest adding a WARNING to the docs until this is fixed, since it will break queries using such operators.