Closed SDorgan closed 1 year ago
The first thing that comes to mind is that you might want to use the parser instead of using the analyzer. The parser only performs syntactic validation on queries and returns the syntax tree (in the form of an ASTStatement
).
You'll see that the parse tree is a direct representation of the query text. It does not have things like the types of columns, the function signatures, etc. But it might be enough for your use case. Feel free to share what your use case is and I might be able to help determine if you can just use the parser.
This is a short example of how to only parse a statement. The ASTStatement
is the root of the syntax tree. You can work with it directly or use an implementation of the ParseTreeVisitor
class to traverse it.
String query = "UPDATE t SET column1 = '';";
ASTStatement parsedStatement = Parser.parseStatement(query, BigQueryLanguageOptions.get());
System.out.println(parsedStatement);
Now, talking about the analyzer a little.
The analyzer takes the output from the parser, performs semantic validations and builds the resolved tree. Since it performs semantic validations; it fails when a table doesn't exist, when a function call has an invalid signature, when there's a missing GROUP BY
, etc. The resolved tree can't really be built without having catalog information.
If you don't provide a catalog yourself when analyzing with the toolkit, the library will start with an empty catalog. It then takes care of maintaining that catalog when you, for example, analyze a CREATE
statement. I see now that the AnalyzeWithoutCatalog
name might not be great for the example 😅.
So, if you're using the toolkit, this will be analyzed just fine:
CREATE TEMP TABLE `t` (
column1 STRING
);
INSERT INTO `t`(column1) VALUES ('');
SELECT column1 FROM `t`;
But analyzing this will always fail, unless you provide a catalog where t
exists:
SELECT column1 FROM `t`;
Sorry for the late response. I think the Parser with the parseScript
function might be what I was looking for actually. I'll look into the ParseTreeVisitor
as well. Thanks once again 😄
You're welcome! Thank you for the question!
I’m having troubles with the usage of the analyzer in the AnalyzeWithoutCatlog example. If I try to validate the following queries by themselves:
SELECT column from t;
orUPDATE t SET column1 = '';
The client will always answers with:
INVALID_ARGUMENT: Table not found: t
I’m looking into analyzing queries and accessing their syntax tree without any Catalog Validation. Is there a way to achieve this with the current toolkit?