Summary
This change adds support for parsing Spark SQL hints using the ANTLR parser and our custom SQL AST.
The following changes were made:
- Regenerated the ANTLR parser from the modified grammar with `antlr4 -Dlanguage=Python3 -visitor SqlBaseLexer.g4 SqlBaseParser.g4 -o generated`.
- Modified the generated grammar to support hints. Hints previously had a few issues:
  - The `isHint()` method was written entirely in Java, so I converted it manually to Python-compatible code.
  - Java has a `char` type that can be compared directly with integer character codes, but in Python this comparison has to be made explicit with `chr(...)`.
  - The final `isHint()` method looks like this:

    ```python
    def isHint(self) -> bool:
        """
        This method will be called when we see '/*' and try to match it as a bracketed comment.
        If the next character is '+', it should be parsed as hint later, and we cannot match
        it as a bracketed comment.
        Returns True if the next character is '+'.
        """
        nextChar = self._input.LA(1)
        # LA(1) returns an integer code point, or a negative EOF marker at end of
        # input, which chr() rejects, so guard before converting.
        return nextChar >= 0 and chr(nextChar) == '+'
    ```
- Added a `Hint` tree node to the AST to store parsed hint statements. Also added a list of hints to `SelectExpression`.
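The lookahead check behind `isHint()` can be tried outside the generated lexer. The following is a minimal sketch, not the code in this change: `FakeCharStream`, `is_hint`, `after_comment_open`, and the `EOF` constant are hypothetical stand-ins that only mimic the `LA(1)` call, on the assumption that the stream reports end of input with a negative value.

```python
# Assumed sentinel for end of input; the real value comes from the ANTLR runtime.
EOF = -1


class FakeCharStream:
    """Hypothetical minimal character stream exposing an LA(offset) lookahead."""

    def __init__(self, text: str, index: int = 0) -> None:
        self._text = text
        self._index = index

    def LA(self, offset: int) -> int:
        """Return the code point `offset` characters ahead (1-based), or EOF."""
        pos = self._index + offset - 1
        if pos < 0 or pos >= len(self._text):
            return EOF
        return ord(self._text[pos])


def is_hint(stream: FakeCharStream) -> bool:
    """Return True if the character right after '/*' is '+'."""
    next_char = stream.LA(1)
    # Guard against EOF before calling chr(), which raises on negative values.
    return next_char >= 0 and chr(next_char) == "+"


def after_comment_open(sql: str) -> FakeCharStream:
    """Position a stream just past the first '/*' in `sql`."""
    return FakeCharStream(sql, sql.index("/*") + 2)


print(is_hint(after_comment_open("SELECT /*+ BROADCAST(t) */ * FROM t")))  # True
print(is_hint(after_comment_open("SELECT /* plain comment */ * FROM t")))  # False
print(is_hint(after_comment_open("SELECT 1 /*")))  # False: input ends after '/*'
```

The third call covers input that ends immediately after `/*`. Without the EOF guard, `chr()` would be handed a negative number and raise `ValueError` instead of returning `False`.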
Test Plan
- Added tests for a variety of Spark hints taken from https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-hints.html
- `make check` passes
- `make test` shows 100% unit test coverage
Deployment Plan
N/A