microsoft / SqlScriptDOM

ScriptDOM/SqlDOM is a .NET library for parsing T-SQL statements and interacting with its abstract syntax tree

uninitialized token indexes in some fragments #91

Open vemoo opened 1 month ago

vemoo commented 1 month ago

Using Microsoft.SqlServer.TransactSql.ScriptDom version 161.9135.0.

For example:

select 1
option(recompile)

gets parsed as:

TOKENS:
0: Select "select"
1: WhiteSpace " "
2: Integer "1"
3: WhiteSpace "\n"
4: Option "option"
5: LeftParenthesis "("
6: Identifier "recompile"
7: RightParenthesis ")"
8: EndOfFile 

FRAGMENTS:
TSqlScript {
  FirstTokenIndex = 0,
  LastTokenIndex = 8,
  Batches = [
    TSqlBatch {
      FirstTokenIndex = 0,
      LastTokenIndex = 7,
      Statements = [
        SelectStatement {
          FirstTokenIndex = 0,
          LastTokenIndex = 7,
          QueryExpression = QuerySpecification {
            FirstTokenIndex = 0,
            LastTokenIndex = 2,
            UniqueRowFilter = NotSpecified,
            SelectElements = [
              SelectScalarExpression {
                FirstTokenIndex = 2,
                LastTokenIndex = 2,
                Expression = IntegerLiteral {
                  FirstTokenIndex = 2,
                  LastTokenIndex = 2,
                  LiteralType = Integer,
                  Value = 1,
                },
              },
            ],
          },
          OptimizerHints = [
            OptimizerHint {
              FirstTokenIndex = -1,
              LastTokenIndex = -1,
              HintKind = Recompile,
            },
          ],
        },
      ],
    },
  ],
}

Here OptimizerHint ends up with -1 in both FirstTokenIndex and LastTokenIndex, i.e. its token indexes are never initialized.

I think the cause is in this part of the grammar:

https://github.com/microsoft/SqlScriptDOM/blob/d84cc30809b29cc5497809c2b0432bf23c412c69/SqlScriptDom/Parser/TSql/TSql80.g#L6276-L6294

There the tokens get assigned to vParent, but they should be assigned to vHint.

Another example:

if 1=1
begin
    select 1
end

gets parsed as:

TOKENS:
0: If "if"
1: WhiteSpace " "
2: Integer "1"
3: EqualsSign "="
4: Integer "1"
5: WhiteSpace "\n"
6: Begin "begin"
7: WhiteSpace "\n"
8: WhiteSpace "    "
9: Select "select"
10: WhiteSpace " "
11: Integer "1"
12: WhiteSpace "\n"
13: End "end"
14: EndOfFile 

FRAGMENTS:
TSqlScript {
  FirstTokenIndex = 0,
  LastTokenIndex = 14,
  Batches = [
    TSqlBatch {
      FirstTokenIndex = 0,
      LastTokenIndex = 13,
      Statements = [
        IfStatement {
          FirstTokenIndex = 0,
          LastTokenIndex = 13,
          Predicate = BooleanComparisonExpression {
            FirstTokenIndex = 2,
            LastTokenIndex = 4,
            ComparisonType = Equals,
            FirstExpression = IntegerLiteral {
              FirstTokenIndex = 2,
              LastTokenIndex = 2,
              LiteralType = Integer,
              Value = 1,
            },
            SecondExpression = IntegerLiteral {
              FirstTokenIndex = 4,
              LastTokenIndex = 4,
              LiteralType = Integer,
              Value = 1,
            },
          },
          ThenStatement = BeginEndBlockStatement {
            FirstTokenIndex = 6,
            LastTokenIndex = 13,
            StatementList = StatementList {
              FirstTokenIndex = -1,
              LastTokenIndex = -1,
              Statements = [
                SelectStatement {
                  FirstTokenIndex = 9,
                  LastTokenIndex = 11,
                  QueryExpression = QuerySpecification {
                    FirstTokenIndex = 9,
                    LastTokenIndex = 11,
                    UniqueRowFilter = NotSpecified,
                    SelectElements = [
                      SelectScalarExpression {
                        FirstTokenIndex = 11,
                        LastTokenIndex = 11,
                        Expression = IntegerLiteral {
                          FirstTokenIndex = 11,
                          LastTokenIndex = 11,
                          LiteralType = Integer,
                          Value = 1,
                        },
                      },
                    ],
                  },
                },
              ],
            },
          },
        },
      ],
    },
  ],
}

Here StatementList is missing its token indexes: both FirstTokenIndex and LastTokenIndex are -1.

llali commented 1 month ago

@vemoo I added a new test in this PR to verify the bug: https://github.com/microsoft/SqlScriptDOM/pull/94. However, my test passes, so I'm not sure what I'm missing.

vemoo commented 1 month ago

The issue is not that OptimizerHints is empty; it's that FirstTokenIndex and LastTokenIndex on the OptimizerHint are -1.
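
To make the distinction concrete, here is a rough, language-neutral sketch of the check a test would need: walk every fragment and flag any whose token indexes are negative, rather than only checking that the hint list is populated. This is a Python model, not the ScriptDom API. The `Fragment` class and the `uninitialized` helper are hypothetical stand-ins, and the tree is copied by hand from the first dump in this issue.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Fragment:
    """Hypothetical stand-in for a parsed fragment; not the library's type."""
    kind: str
    first: int  # models FirstTokenIndex
    last: int   # models LastTokenIndex
    children: List["Fragment"] = field(default_factory=list)


def uninitialized(root: Fragment) -> Iterator[Fragment]:
    """Yield, in document order, every fragment with a negative token index."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.first < 0 or node.last < 0:
            yield node
        # Push children reversed so they are visited left to right.
        stack.extend(reversed(node.children))


# Tree transcribed from the "select 1 option(recompile)" dump above.
tree = Fragment("TSqlScript", 0, 8, [
    Fragment("TSqlBatch", 0, 7, [
        Fragment("SelectStatement", 0, 7, [
            Fragment("QuerySpecification", 0, 2, [
                Fragment("SelectScalarExpression", 2, 2, [
                    Fragment("IntegerLiteral", 2, 2),
                ]),
            ]),
            Fragment("OptimizerHint", -1, -1),
        ]),
    ]),
])

bad = [f.kind for f in uninitialized(tree)]
print(bad)
```

A test that only asserts the hint list is non-empty would pass on this tree, while asserting that `bad` is empty would fail, which matches the behaviour reported here.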