bruce-dunwiddie / tsql-parser

Library Written in C# For Parsing SQL Server T-SQL Scripts in .Net
Apache License 2.0
324 stars 56 forks source link

Issue with offset and fetch keywords #75

Closed byronlong99 closed 3 years ago

byronlong99 commented 3 years ago

Seems like when I go from version 1.2.9 to 1.3.0, longer select statements seem to get cut short. I have a library built on top of this library with a test suite and one of my tests seems to fail with the below query:

SELECT FROM Product.Product P ORDER BY P.ProductId OFFSET (@Page -1) @RowPerPage ROWS FETCH NEXT @RowPerPage ROWS ONLY

Not sure what it is about this particular query but it seems to only return 14 tokens where as I think it should have roughly 26. In the older version it was returning more than 14.

I'm trying to add support for update statements to my library which it seems has been added in a later version but unfortunately it breaks my existing functionality when I upgrade to the latest version of tsql-parser.

I could try to debug your library and submit a PR if you'd like.

Awesome library BTW. Does a ton of the dirty (parsing) work for you.

bruce-dunwiddie commented 3 years ago

Thank you for letting me know about this.

I'll happily review a PR if you can dig into it, but I'll also review over the next couple of days.

Can you give me some of the code you're using or a simple repro?

Are you using the tokenizer, or the statement parser? It's possible the statement parser would previously see this as a single statement, and now it's seeing two or more.

To get all the additional statement types working I had to get much more specific on my parsing, including inside parens. I think I had an OFFSET example in my tests, but it might be something as simple as it not handling parens inside the OFFSET expression, and I'd just need to switch what's allowed there.

Bruce Dunwiddie

byronlong99 commented 3 years ago

Sure, I'm actually using TSQLStatementReader.ParseStatements to build the list of statement parsers.

It's specifically the OrderBy property. It should contain around 19 tokens but only has 14.

Maybe I need to tweak something on my end?

Here is the snippet of my code. I'm not actually using a hardcoded string in the code, but that is exactly what my test passes.

 var statements = TSQLStatementReader.ParseStatements("SELECT * FROM Product.Product P ORDER BY P.ProductId OFFSET (@Page -1) * @RowPerPage ROWS FETCH NEXT @RowPerPage ROWS ONLY");
            var statement = (TSQLSelectStatement) statements[0];

            _clauseConverters = new List<ClauseProcessor>
            {
                new GeneralClauseConverter(statement.Select),
                new FromClauseConverter(statement.From),
                new WhereClauseConverter(statement.Where),
                new GeneralClauseConverter(statement.GroupBy),
                new OrderByClauseConverter(statement.OrderBy)
            };

Let me know if you need any more info!

bruce-dunwiddie commented 3 years ago

The issue is caused by the word FETCH, which can also be used as the start of a new statement, e.g.

FETCH NEXT FROM contact_cursor;

https://docs.microsoft.com/en-us/sql/t-sql/language-elements/fetch-transact-sql?view=sql-server-ver15

The parser currently looks for keywords that can mark a new statement start as a signal to stop parsing the current statement:

https://github.com/bruce-dunwiddie/tsql-parser/blob/6f9194dbbaa8aeda48e5f4065cef59503161765a/TSQL_Parser/TSQL_Parser/Clauses/Parsers/TSQLOrderByClauseParser.cs#L26

https://github.com/bruce-dunwiddie/tsql-parser/blob/6f9194dbbaa8aeda48e5f4065cef59503161765a/TSQL_Parser/TSQL_Parser/Clauses/Parsers/TSQLSubqueryHelper.cs#L48

https://github.com/bruce-dunwiddie/tsql-parser/blob/6f9194dbbaa8aeda48e5f4065cef59503161765a/TSQL_Parser/TSQL_Parser/Statements/TSQLKeywordStatementExtensions.cs#L50

But the syntax you're using is valid, e.g.

https://docs.microsoft.com/en-us/sql/t-sql/queries/select-order-by-clause-transact-sql?view=sql-server-ver15

So unfortunately, I think this means needing to create a TSQLOffsetClauseParser, and adding OFFSET to the stop words within the current TSQLOrderByClauseParser logic, so that the logic can be overridden. It's not horrible, and there are patterns that can be applied, just takes some carefulness and applied brain power.

Bruce Dunwiddie

byronlong99 commented 3 years ago

Cool, are suggesting that I create a PR?

bruce-dunwiddie commented 3 years ago

Cool, would you like me to add that class and submit a PR?

Absolutely, if you have the time, energy, and brain power.

I also just noticed an oddity, that may or not be correct. I currently have OFFSETS as a TSQL keyword, but not OFFSET.

https://github.com/bruce-dunwiddie/tsql-parser/blob/6f9194dbbaa8aeda48e5f4065cef59503161765a/TSQL_Parser/TSQL_Parser/TSQLKeywords.cs#L151

I try to keep that list only being the "official" reserved words, e.g. https://docs.microsoft.com/en-us/sql/t-sql/language-elements/reserved-keywords-transact-sql?view=sql-server-ver15 or words that you can't use as table names without escaping, but this one might need to be researched.

Bruce Dunwiddie

byronlong99 commented 3 years ago

I do see it mentioned here in this article on the order by clause.

https://docs.microsoft.com/en-us/sql/t-sql/queries/select-order-by-clause-transact-sql?view=sql-server-ver15

It's strange that it's not in the list that you posted but mentioned in other documentation.

byronlong99 commented 3 years ago

Hi Bruce.

Is there a way I can get access to commit to a branch? I created a hotfix off the master branch. I think there is a slight bit more of organization of the code that I need to do (move the fetch clause parser to it's own class). Seems I don't have access to create new branches.

Thanks!

Byron

bruce-dunwiddie commented 3 years ago

I would suggest you treat your entire fork as the branch, that you can then submit back to me as a pull request into my Dev branch. I don't currently see any reason you'd want to create an additional branch within your fork, or that I'd want to add a branch, that you'd then have to resync your fork from that branch. That all sounds overly complicated, especially for a single feature.

byronlong99 commented 3 years ago

Maybe I misunderstood.

I haven't created a fork before on github so I thought I was just branching and merging against your repo.

Seems that a fork is my own copy which I have control over and there would be no need to create branches. I would just commit to the master branch and from there I can try to reintegrate back into your repo via pull request.

Is that correct?

Thanks!

bruce-dunwiddie commented 3 years ago

Yes. That is how most users work against public GitHub. Private GitHub with a dedicated team would most likely have a different workflow. This also technically for your situation allows for you to create a forked build with the fixed feature within your application until I can create a new public release with the fixed feature.

byronlong99 commented 3 years ago

Hi Bruce.

It seems I've created the PR. I didn't create any additional tests though. All the tests are passing and my tests were passing as well after making these changes.

Let me know if you have any suggestions on the PR.

Thanks!

Byron

bruce-dunwiddie commented 3 years ago

76

bruce-dunwiddie commented 3 years ago

https://www.nuget.org/packages/TSQL.Parser/1.5.3