andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License
3.63k stars 685 forks source link

UNKNOWN type in an INSERT #762

Closed michelmetran closed 3 months ago

michelmetran commented 5 months ago

Describe the bug

When I'm using an "INSERT" preceded by "USE dbname; GO;", typically used in SQL Server, the first INSERT gives an error, being recognized as UNKNOWN .

Example code follows.

import sqlparse

sql = """
USE ESTAGIARIO
GO

INSERT INTO estagiario.dbo.Tab_Geral (Promo_Atual, Naturalidade, Dt_Nascimento) values ('PJ CRIMINAL DE OSASCO', 'Piracicaba', '2002-5-7');
INSERT INTO estagiario.dbo.Tab_Geral (Promo_Atual, Naturalidade, Dt_Nascimento) values ('PJ DE LENÇÓIS PAULISTA', 'São Paulo', '2002-5-7');
INSERT INTO estagiario.dbo.Tab_Geral (Promo_Atual, Naturalidade, Dt_Nascimento) values ('3ª PROMOTORIA DE JUSTIÇA CRIMINAL DA CAPITAL', 'Santos', '2002-5-7');
"""

for text in sqlparse.parse(sql):    
    print(text.normalized)
    print(text.get_type())


Output....

USE ESTAGIARIO
GO

INSERT INTO estagiario.dbo.Tab_Geral (Promo_Atual, Naturalidade, Dt_Nascimento) values ('PJ CRIMINAL DE OSASCO', 'Piracicaba', '2002-5-7');
***UNKNOWN***

INSERT INTO estagiario.dbo.Tab_Geral (Promo_Atual, Naturalidade, Dt_Nascimento) values ('PJ DE LENÇÓIS PAULISTA', 'São Paulo', '2002-5-7');
***INSERT***

INSERT INTO estagiario.dbo.Tab_Geral (Promo_Atual, Naturalidade, Dt_Nascimento) values ('3ª PROMOTORIA DE JUSTIÇA CRIMINAL DA CAPITAL', 'Santos', '2002-5-7');
***INSERT***


Expected behavior I understand that there is a problem when the INSERT is preceded by tags not recognized by sqlparser (USE, GO)


Versions (please complete the following information):


Additional context Maybe this problem is also related to the issue #727

andialbrecht commented 3 months ago

Documentation of GO: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/sql-server-utilities-statements-go?view=sql-server-ver16

This looks interesting... GO acts as a statement terminator, but according to the docs a semicolon after GO shouldn't be used. I think that's why the parser is a bit confused here ;)