Closed ArsenShnurkov closed 9 years ago
I'm probably going to close this one, as this is probably too broad to call a feature request. I mean, it's essentially "rewrite the SQL parser". That does need to be done, certainly, but all the good research you've got here basically reduces to "we need to research parser options". I might come back and attach this to a milestone if I branch and successfully move from the rats' nest to another parser, however, or edit it once I'm convinced a certain parser engine is the way to go.
Interestingly, the one project in your initial draft of this feature request, DeveelDB, was using Irony initially! I'm not sure why Irony wouldn't support PCL, but that's their reason for moving. It looks like what they're planning to use is ANTLR (see also here), though obviously the C# port.
It also looks like it'd be worth adding that project (DeveelDB) to your list of C# native DBMSs. It seems it's already fairly mature (surprised I didn't google it up much earlier; it's been around since 2009 in some form) and does a lot more than SqlDbSharp.
If you want to keep posting more findings from your research on this closed issue, though, feel free to. I'm pretty happy to see all that you've already found, which is much more than I had before starting SqlDbSharp.
@ruffin-- thanks for your interest in my project. DeveelDB is much older than 2009: it started as Minosse RDBMS, but since it was an academic project for the university, I dropped it quickly and then picked it up and put it down again over the years, trying to make something concrete out of it. For me it is like a hobby, since I have a totally different job, and I enjoy the new direction I gave the project, rewriting it from scratch.
In fact, Minosse RDBMS and DeveelDB 1.0 were based on my own port of JavaCC (an abandoned project named MinosseCC/CSharpCC), but recently, in keeping with the new direction, I decided to adopt Irony. Performance is great and I have total control over the code, the rules, the parsers and so forth. I had to make some modifications to the project myself (the latest one to support PCL), but I released them to the GitHub branch created by my friend Atsushi Enomoto.
The parser you will find in DeveelDB is far from complete: it is still a work in progress, and I'm still working on the statement parsing (especially the PL/SQL syntax for DeveelDB). Feel free to ask for further information if you need it, although text processing is not really my field.
https://en.wikipedia.org/wiki/Comparison_of_parser_generators
So, I want to find an evaluation of these three (from most to least promising):
YaccConstructor, F#, GLL, Apache 2.0, http://yaccconstructor.github.io/YaccConstructor/gll.html
GLRSharp, GLR, C# v4, MIT, https://github.com/jcoder58/GLRSharp
NLT, GLR, C#, MIT, http://sourceforge.net/projects/naivelangtools/
I have read your thoughts about Irony: https://github.com/ruffin--/SqlDbSharp/issues/1#issuecomment-124851811
== LL, LR & LALR ==
https://en.wikipedia.org/wiki/Irony_(framework)
Irony: MIT, LALR(1) parser. It uses a nontraditional approach (C#-based syntax description), and this scares me (I need to learn more).
Here, yacc-like (LALR = Look-Ahead LR) parser generators are claimed to be more powerful than LL ones: http://stackoverflow.com/questions/5975741/what-is-the-difference-between-ll-and-lr-parsing
Here are two such projects (from http://stackoverflow.com/a/2552225/1709408):
GPPG, C#, BSD, https://gppg.codeplex.com
GPLEX, C#, BSD, https://gplex.codeplex.com
Here CSharpCC is recommended (if LL ~= recursive descent): http://stackoverflow.com/questions/1194584/what-is-a-good-c-sharp-compiler-compiler-parser-generator
CSharpCC, C#, LL(k), BSD, http://github.com/deveel/csharpcc
http://mortoray.com/2012/07/20/why-i-dont-use-a-parser-generator/ "Getting location information is a hassle, and never seems natively supported."
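To make the "LL ~= recursive descent" idea and the location-information complaint concrete, here is a minimal sketch of a hand-written recursive-descent parser that carries line and column on every token. It is in Python only for brevity and is not taken from any of the projects listed here. The toy grammar (`SELECT` columns `FROM` table) and all names (`Token`, `ParseError`, `tokenize`, `Parser`, `parse`) are assumptions made up for the illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "KEYWORD", "IDENT", "COMMA", "STAR", "EOF" (or "ERROR")
    text: str
    line: int  # 1-based line of the first character
    col: int   # 1-based column of the first character


class ParseError(Exception):
    def __init__(self, message, token):
        # Every error message is prefixed with "line:col:".
        super().__init__(f"{token.line}:{token.col}: {message}")
        self.token = token


TOKEN_RE = re.compile(
    r"(?P<WS>\s+)|(?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)|(?P<COMMA>,)|(?P<STAR>\*)"
)
KEYWORDS = {"SELECT", "FROM"}  # hypothetical toy keyword set


def tokenize(text):
    tokens = []
    pos, line, col = 0, 1, 1
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            bad = Token("ERROR", text[pos], line, col)
            raise ParseError(f"unexpected character {text[pos]!r}", bad)
        lexeme, kind = m.group(), m.lastgroup
        if kind != "WS":
            if kind == "IDENT" and lexeme.upper() in KEYWORDS:
                kind = "KEYWORD"
            tokens.append(Token(kind, lexeme, line, col))
        # Advance the location past the lexeme (only whitespace has newlines).
        newlines = lexeme.count("\n")
        if newlines:
            line += newlines
            col = len(lexeme) - lexeme.rfind("\n")
        else:
            col += len(lexeme)
        pos = m.end()
    tokens.append(Token("EOF", "", line, col))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def expect(self, kind, text=None):
        tok = self.peek()
        if tok.kind != kind or (text is not None and tok.text.upper() != text):
            raise ParseError(
                f"expected {text or kind}, found {(tok.text or tok.kind)!r}", tok
            )
        self.i += 1
        return tok

    def parse_select(self):
        # select := SELECT columns FROM IDENT EOF
        self.expect("KEYWORD", "SELECT")
        columns = self.parse_columns()
        self.expect("KEYWORD", "FROM")
        table = self.expect("IDENT")
        self.expect("EOF")
        return {"columns": columns, "table": table.text}

    def parse_columns(self):
        # columns := '*' | IDENT (',' IDENT)*
        if self.peek().kind == "STAR":
            self.i += 1
            return ["*"]
        cols = [self.expect("IDENT").text]
        while self.peek().kind == "COMMA":
            self.i += 1
            cols.append(self.expect("IDENT").text)
        return cols


def parse(text):
    return Parser(tokenize(text)).parse_select()
```

Each grammar rule becomes one method, and because the failing token is in hand when an error is raised, reporting its position is just string formatting. For example, `parse("SELECT a,\n  FROM t")` raises a `ParseError` whose message starts with `2:3:`.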
=== LALR => GLR ===
https://en.wikipedia.org/wiki/LALR_parser
"the C-language and C++ parsers of the Gnu Compiler Collection ... started as LALR parsers but were later changed to recursive-descent parsers."
See http://stackoverflow.com/questions/6319086/are-gcc-and-clang-parsers-really-handwritten/6319216#6319216
=== GLR & GLL ===
https://en.wikipedia.org/wiki/GLR_parser
Generalized LR is described as more powerful than LALR: http://stackoverflow.com/a/6320330/1709408
https://github.com/jcoder58/GLRSharp
http://stackoverflow.com/questions/4128609/any-glr-parser-generators-for-net
The Hime parser is advised there: https://bitbucket.org/laurentw/hime/overview (LGPL 2.1)
One more GLR parser: https://code.google.com/p/recursive-ascent/wiki/RNGLR
GLL parsing also exists:
http://dotat.at/tmp/gll.pdf
http://yaccconstructor.github.io/YaccConstructor/gll.html (Apache 2.0, F# language)
=== "tokenless" parsing ===
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.37.7828&rep=rep1&type=ps
(taken from this article: http://tratt.net/laurie/blog/entries/parsing_the_solved_problem_that_isnt )