Fixes Issue 349: Problems Parsing MySQL Delimiter

cmiles74 commented 5 years ago

I found some issues with the MySQL statement parser that were causing problems. In addition to the issues with trailing delimiters, I also saw some problems parsing parenthesis in views that had many joins. Typically this was from DDL produced by MySQL itself (i.e., the DDL was pulled from MySQL, edited, and then RoundHouse was executing the edit).

I re-wrote the parsing code with an eye specifically towards the minimum amount of parsing that would be required to render reliable statements that we could then execute. This parser will parse and handle the following types of statements:

Statements that define the delimiter
Statements of SQL

To support that I coded up a scanner that handles the minimum tokens we need to let us parse out those statements. The scanner will produce the following types of tokens:

Delimiter declaration keyword
Delimiters
Comments (delimiters in comments are ignored, comments in stored procedured are retained)
Quote (delimiters in quotes are ignored)
ANSI Quote (delimiters in ANSI quotes are ignored if ANSI quotes are being honored)
White space (we want to retain the white space present in stored procedures)
Text (delimiters in text denote the end of the statement)

I have tested this with my own SQL and it works well. It handles nested parenthesis as well as trailing delimiter declarations.

In the future, I would like to revisit this code and perhaps alter it to work by reading a buffered stream of data instead of pulling the entire script into a variable and producing a list of tokens and a list of statements. That said, this code works in a manner similar to the existing MySQL parser.

Thank you!

erikbra commented 5 years ago

This is great stuff, @cmiles74! Love that you have separated out the parser logic into a separate class. There are a couple of linter errors. There is a roundhouse.sln.DotSettings file in the root folder that controls these rules. They may be a bit unfamiliar (at least they were to me). I don't know if you have access to e.g. Visual Studio or Jetbrains' Rider IDE, I think both of these can give you the linter tips directly in the IDE. I don't think there is any plugin support for .DotSettings files in VSCode, which you seem to be using.

Other than this, stuff looks good. What would make it even better, though, is unit tests. I would really appreciate if you could make some simple tests that demonstrate the logic in the parser. I know the test coverage of the parsing is virtually non-existant for MySql, but, especially since I'm not a MySql pro, tests would really help me understand and validate the logic of the parser.

That said, splitting out into parser classes is definitely a step in the right direction, and I think it this in time could be generalised to be used in the other DB providers too.

cmiles74 commented 5 years ago

I think this last commit covers all of the remaining static analysis issues. There are five items still flagged, but they seem more like a matter of style. Let me know if you'd like them changed one way or the other.

Thank you! :-)

erikbra commented 5 years ago

I'm really fine with the remaining linter issues, no worries.

But I would really appreciate some tests on the new classes, at least the parser, ideally with some sample SQL (inline or in files, I don't care), and test on the desired splitting behaviour. Would you be able to create some as part of this PR?

cmiles74 commented 5 years ago

Understood, I will try to get some together in the next day or two.

erikbra commented 5 years ago

That would be perfect, thanks. And, again, I really appreciate the effort and attention to quality and detail of this PR.

cmiles74 commented 5 years ago

Okay, I think the test I've added cover the important stuff. They all revolve around splitting the SQL script out into statements and they test for the following...

the correct number of statements
the preservation of comments
single character delimiters
two character delimiters
multiple character delimiters

If you think of other tests that make sense, let me know and I will get them added.

erikbra commented 5 years ago

This looks good now, @cmiles74 . Thanks a lot for your contribution!

chucknorris / roundhouse

Fixes Issue 349: Problems Parsing MySQL Delimiter #357