Closed mrmasterplan closed 1 year ago
Hi @sjrusso8, check out this PR. I think it would future-proof us against further missing Databricks Spark SQL syntax.
Base: 96.90% // Head: 96.95% // Increases project coverage by +0.04% :tada:
Coverage data is based on head (eb916a6) compared to base (8b789f2). Patch coverage: 100.00% of modified lines in pull request are covered.
@mrmasterplan This is a really interesting addition! It makes extending the keywords much easier for one-off or niche SQL flavors.
Would we need to consider how to add custom SQL regex values to SQL_REGEX, so that specific phrases like 'ZORDER BY' can be captured?
@sjrusso8 That is exactly the type of use case I had in mind with the way the Lexer is configured, and it is why I kept the file keywords.py mostly unchanged. You can easily design your own Lexer grammar like this:
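import re

import sqlparse
from sqlparse import keywords
from sqlparse.lexer import Lexer

# Lexer is a singleton, so this configures the instance used by sqlparse.parse
lex = Lexer()
# drop the default configuration before loading a custom one
lex.clear()

# custom rule: treat "ZORDER BY" as a single keyword token
my_regex = (
    re.compile(r"ZORDER\s+BY\b", keywords.FLAGS).match,
    sqlparse.tokens.Keyword,
)

# slice the default SQL_REGEX list to inject the custom rule at index 38
lex.set_SQL_REGEX(keywords.SQL_REGEX[:38] + [my_regex] + keywords.SQL_REGEX[38:])

# reload the default keyword dictionaries
lex.add_keywords(keywords.KEYWORDS_COMMON)
lex.add_keywords(keywords.KEYWORDS_ORACLE)
lex.add_keywords(keywords.KEYWORDS_PLPGSQL)
lex.add_keywords(keywords.KEYWORDS_HQL)
lex.add_keywords(keywords.KEYWORDS_MSACCESS)
lex.add_keywords(keywords.KEYWORDS)

sqlparse.parse("select * from foo zorder by bar;")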
In fact, I just added a test to that effect.
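To illustrate why the custom rule in the snippet above is spliced into the middle of the regex list rather than appended, here is a minimal, self-contained sketch of a first-match-wins rule list. It does not use sqlparse; the `rule` and `tokenize` helpers, the toy `BASE_RULES` list, and the string token names are hypothetical stand-ins chosen for illustration only, and the real lexer's internals may differ.

```python
import re

FLAGS = re.IGNORECASE | re.UNICODE


def rule(pattern, token_type):
    """Pair a compiled matcher with the token type it should emit."""
    return (re.compile(pattern, FLAGS).match, token_type)


# Toy rule list (hypothetical, far smaller than a real SQL grammar).
BASE_RULES = [
    rule(r"\s+", "Whitespace"),
    rule(r"[A-Z_]\w*", "Name"),
    rule(r"[*;,]", "Punctuation"),
]


def tokenize(text, rules):
    """Yield (token_type, value) pairs; the first matching rule wins."""
    pos = 0
    while pos < len(text):
        for matcher, token_type in rules:
            m = matcher(text, pos)
            if m:
                yield token_type, m.group()
                pos = m.end()
                break
        else:
            # No rule matched: emit one character as an error token.
            yield "Error", text[pos]
            pos += 1


def visible(tokens):
    """Drop whitespace tokens to make the results easier to compare."""
    return [t for t in tokens if t[0] != "Whitespace"]


zorder = rule(r"ZORDER\s+BY\b", "Keyword")

# Splice the custom rule in *before* the generic Name rule ...
spliced_rules = BASE_RULES[:1] + [zorder] + BASE_RULES[1:]
# ... versus appending it after all the default rules.
appended_rules = BASE_RULES + [zorder]

sql = "select * from foo zorder by bar;"
default_tokens = visible(tokenize(sql, BASE_RULES))
spliced_tokens = visible(tokenize(sql, spliced_rules))
appended_tokens = visible(tokenize(sql, appended_rules))
```

In this sketch, only `spliced_rules` yields a single `("Keyword", "zorder by")` token. With `appended_rules`, the generic Name rule matches `zorder` first, so the custom rule is never reached. This is the reason a specific phrase rule has to sit ahead of the more general rules it overlaps with.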
I don't know the documentation build system used in this repo. Can anyone point me in the right direction to learn it? Then I will add documentation about this configurability feature.
@mrmasterplan I just skimmed through the PR and it looks really promising! Thanks a lot.
The documentation is located under docs/ and uses Sphinx. You can generate the documentation by changing into this directory and running make html. If you don't have Sphinx installed in your environment, you can install it with pip install sphinx.
Thanks @andialbrecht. I added some documentation. I don't know if it fits your style. It covers a somewhat different topic so it was not straightforward to copy the style of another section.
So @andialbrecht, anything you would like me to change, or are you prepared to merge and release this?
Thanks a lot for this change!
This PR makes the Lexer a singleton class. This object carries the configured syntax as instance attributes. A library user who has non-standard syntax requirements is able to adapt the behavior of the Lexer to meet her needs. As an example of how to do this, please see the relevant test: