bruce-dunwiddie / tsql-parser

Library Written in C# For Parsing SQL Server T-SQL Scripts in .Net
Apache License 2.0
324 stars 56 forks source link

Failed parsing CASE with includeWhitespace: true #117

Open teyc opened 1 year ago

teyc commented 1 year ago

Thank you for this library! I noticed I get a null reference fault here:


namespace Tests
{
   public class ParseSelect
   {
        [Test]
        public void Parse_TypicalStoredProc()
        {
            const string sql = @"SELECT DISTINCT ISNULL((
                           CASE 1
                               WHEN 2
                                   THEN 2
                               ELSE 3
                               END
                           ), '') AS F
FROM FAMILY_LOAN AS FL";

            List<TSQLStatement> statements = TSQLStatementReader.ParseStatements(sql, 
              useQuotedIdentifiers: false, includeWhitespace: true);

        }
  }
}

stack trace

Error Message:
   System.NullReferenceException : Object reference not set to an instance of an object.
  Stack Trace:
     at TSQL.Expressions.Parsers.TSQLArgumentListParser.Parse(ITSQLTokenizer tokenizer) in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/Expressions/Parsers/TSQLArgumentListParser.cs:line 39
   at TSQL.Expressions.Parsers.TSQLValueExpressionParser.ParseNext(ITSQLTokenizer tokenizer) in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/Expressions/Parsers/TSQLValueExpressionParser.cs:line 286
   at TSQL.Expressions.Parsers.TSQLSelectExpressionParser.ParseNext(ITSQLTokenizer tokenizer) in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/Expressions/Parsers/TSQLSelectExpressionParser.cs:line 66
   at TSQL.Expressions.Parsers.TSQLSelectExpressionParser.Parse(ITSQLTokenizer tokenizer) in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/Expressions/Parsers/TSQLSelectExpressionParser.cs:line 15
   at TSQL.Elements.Parsers.TSQLSelectColumnParser.Parse(ITSQLTokenizer tokenizer) in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/Elements/Parsers/TSQLSelectColumnParser.cs:line 20
   at TSQL.Clauses.Parsers.TSQLSelectClauseParser.Parse(ITSQLTokenizer tokenizer) in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/Clauses/Parsers/TSQLSelectClauseParser.cs:line 173
   at TSQL.Statements.Parsers.TSQLSelectStatementParser.Parse() in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/Statements/Parsers/TSQLSelectStatementParser.cs:line 50
   at TSQL.Statements.Parsers.TSQLSelectStatementParser.TSQL.Statements.Parsers.ITSQLStatementParser.Parse() in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/Statements/Parsers/TSQLSelectStatementParser.cs:line 170
   at TSQL.TSQLStatementReader.MoveNext() in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/TSQLStatementReader.cs:line 89
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at TSQL.TSQLStatementReader.ParseStatements(String tsqlText, Boolean useQuotedIdentifiers, Boolean includeWhitespace) in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/TSQL_Parser/TSQLStatementReader.cs:line 110
   at Tests.ParseLoanMarket.Parse_TypicalStoredProc() in /Users/chui.tey/toyapps/tsql-parser/TSQL_Parser/Tests/ParseLoanMarket.cs:line 61

Incidentally, would you accept a PR to convert this to netstandard20 classlib and netcore60 test project?

bruce-dunwiddie commented 1 year ago

Thank you for reporting the issue.

I will verify the repro and then update this issue with the finding and ETA for a fix.

I will always take PR's. I don't necessarily always consider it an improvement to build against latest however. Can you provide links to the support status for those frameworks and their predecessors, along with your argument on why you think this should be upgraded to the minimum build? There are also options for creating separate proj files for newer frameworks.

teyc commented 1 year ago

Hi Bruce, thank you again it's a little side exploration for a wider issue, and open source is a gift from you, so please, no ETA. :)

I'm trying to figure out if I can parse some of our larger stored procs and turn them into C# and EF. It might not suit my specific use case, because we have a lot of "OUTER APPLY"

As for building on netstandard20. I'm on a Mac. I can't actually build with .NetFramework. I have been able to multitarget on a single csproj. I'll submit a PR, but I can't test the netframework part.

bruce-dunwiddie commented 1 year ago

I'd be interested in helping towards your use case. I also find code generation and such interesting. Recreating OUTER APPLY subqueries with current day EF generation should be doable. Let me know if you get stuck.

teyc commented 1 year ago

I did a bit more investigation Bruce:

// passes
const string sql = 
  "SELECT ISNULL( CASE 1 WHEN 2 THEN 2 ELSE 3 END, '') AS FROM T";

// fails
const string sql = 
  "SELECT ISNULL( CASE 1 WHEN 2 THEN 2 ELSE 3 END , '') AS FROM T";
bruce-dunwiddie commented 1 year ago

I've verified the repro.

I've merged your change into the develop branch for review. Thanks for narrowing in on the issue for me.

While I agree that your fix fixes this exact situation, I think the issue is that whitespace after the last argument within any function call is probably not being handled correctly, not specific to just CASE. So I'm trying to verify, and will possibly edit the fix to make more broad, before merging to master.

teyc commented 1 year ago

I noticed that the problem does not occur parsing argument list that does not have case statements. However argumentListParser has poor error recovery.

bruce-dunwiddie commented 1 year ago

Ok, I'll compare how those react. The other thing that's interesting in your situation is that you've added in a grouping of parens around the CASE. I don't know if that plays into the situation yet or not.

teyc commented 1 year ago

I refined the test case to a point where it was a trailing space past the CASE that triggered the null reference. There’s some code that is adding NULL to the Tokens list. I noticed there are sections where MoveNext is not checked for success or fail.

bruce-dunwiddie commented 1 year ago

Ya, it's the code that parses arguments, and then tries to access Tokens property from the argument. In this case, argument is null. I agree the logic needs to be hardened, but checking for failure in ParseNext could allow for silently dropped tokens, which worries me more.

teyc commented 1 year ago

What about if I made the Tokens read only, and provided an AddToken function? I can terminate early and make it easier to troubleshoot.

bruce-dunwiddie commented 1 year ago

I'm not sure I follow how that would solve the problems. It's not that I worry about the token handling themselves, it's more about making sure the cursor stays where we expect it, while not dropping any tokens along the way, as they get passed through all the recursion levels.