andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License
3.75k stars 699 forks source link

Alignment broken in formatted output #278

Open SoumenDass opened 8 years ago

SoumenDass commented 8 years ago

Mislined output - indentation keeps increasing for each column:

>>> sql = 'DROP TABLE IF EXISTS abcd;CREATE TABLE abcd(name VARCHAR(64) NOT NULL,value VARCHAR(64) NULL,status VARCHAR(64) NULL);'
>>> result = sqlparse.format(sql, reindent=True, keyword_case = 'upper')
>>> print result
DROP TABLE IF EXISTS abcd;

CREATE TABLE abcd(name VARCHAR(64) NOT NULL,
                                   value VARCHAR(64) NULL,
                                                     status VARCHAR(64) NULL);
>>> print result.replace(' ', '.')
DROP.TABLE.IF.EXISTS.abcd;

CREATE.TABLE.abcd(name.VARCHAR(64).NOT.NULL,
...................................value.VARCHAR(64).NULL,
.....................................................status.VARCHAR(64).NULL);
vmuriart commented 8 years ago

which version of sqlparse do you have installed? this looked like one of the bugs we fixed in 0.2.0

SoumenDass commented 8 years ago

Hmm odd.. I'm using 0.2.0. This can be reproduced on the online site as well.

vmuriart commented 8 years ago

I have a sneaking suspect that this is related to https://github.com/andialbrecht/sqlparse/issues/270 parsing for create tables and such is still a WIP and in particular the lists component which I imagine is what is causing this.

if you check the parse tree you should see the issue stand out on the lists.

try sqlparse.parse(sql)[1]._pprint_tree()

andialbrecht commented 8 years ago

yepp, this is related to #270.

levsh commented 8 years ago

Also wrong identation with VALUES statement

In [5]: sqlparse.__version__
Out[5]: '0.2.1.dev0'

In [6]: query = "insert into settings (name, value) values (1,1), (2,2), (3,3)"

In [7]: print(sqlparse.format(query, reindent=True, keyword_case='upper'))
INSERT INTO settings (name, value)
VALUES (1,
        1), (2,
             2), (3,
                  3)
spacepluk commented 8 years ago

Probably related. When running with -r -a -s -k lower -i lower I get this. Besides the empty lines, notice that the first line of each subquery has one space too much. And it's treating this and this differently for some reason.

I'd like to get something like this instead.

vmuriart commented 8 years ago

@spacepluk I think your case is a different issue. The cases above are due to incorrect parsing of statements other than select type statements. Since your's is a select statement it shouldn't have the same problem.

Can you open a new issue for yours to debug it separately? I have a few follow up questions, but I'll post it on the separate issue :)

jml commented 5 years ago

I've been mucking around trying to get a minimal example for this problem. Here are two minimal tests in the style of test_format.py:

    def test_create_statement(self):
        # TODO: jml doesn't like this behaviour. Just trying to document what's actually going on.
        # Bug here is that we aren't doing the nice columnar indenting that we do for queries.
        f = lambda sql: sqlparse.format(sql, reindent=True, comma_first=True)
        sql = '\n'.join([
            "CREATE TABLE IF NOT EXISTS user",
            "(",
            "    id BIGINT,",
            "    email VARCHAR,",
            "    name VARCHAR",
            ");",
        ])
        assert f(sql) ==  "CREATE TABLE IF NOT EXISTS user (id BIGINT, email VARCHAR, name VARCHAR);"

    def test_create_statement_type_arguments(self):
        # TODO: Wrong behaviour. Just starting with a test that demonstrates
        # current wrong behaviour. Wacky re-indentation when the type of a
        # column has parameters, e.g. VARCHAR(64).
        f = lambda sql: sqlparse.format(sql, reindent=True)
        sql = '\n'.join([
            "CREATE TABLE IF NOT EXISTS user",
            "(",
            "    id BIGINT,",
            "    email VARCHAR(32),",
            "    name VARCHAR(32)",
            ");",
        ])
        assert f(sql) == '\n'.join([
            "CREATE TABLE IF NOT EXISTS user (id BIGINT, email VARCHAR(32),",
            "                                                  name VARCHAR(32));"
        ])

They both demonstrate the current wrong behaviour, rather than the desired behaviour. The first one shows that CREATE statements have all the columns mooshed onto one line, rather than one line per column (which is what I'd expect reindent=True to do).

The second one is more pertinent to this bug. It shows that if you add arguments to the column type (e.g. VARCHAR(32) instead of VARCHAR), you induce the weird over-indentation reported by @SoumenDass.

I'm going to continue to tinker on this in my spare time, and will try to submit a PR. Please don't rely on me though.

garu57 commented 4 years ago

Any hope to see this fixed? As soon as a table has more than a few columns, the output becomes rapidly unreadable.

CREATE TABLE "wordfrequencies"
  ("document_id" INTEGER NOT NULL,
                         "word_id" INTEGER NOT NULL,
                                           "occurences" INTEGER NOT NULL,
                                                                PRIMARY KEY ("document_id",
                                                                             "word_id"),
   FOREIGN KEY ("document_id") REFERENCES "document" ("id") ON DELETE RESTRICT ON UPDATE CASCADE,
   FOREIGN KEY ("word_id") REFERENCES "words" ("id") ON DELETE RESTRICT ON UPDATE CASCADE)

Thanks.

maciek16180 commented 2 years ago

It would be great to have this fixed.