GoogleCloudPlatform / zetasql-toolkit

The ZetaSQL Toolkit is a library that helps users use the ZetaSQL Java API to perform SQL analysis for multiple query engines, including BigQuery and Cloud Spanner.
Apache License 2.0

Getting the Executed line for each Resolved Statement #16

Closed: SDorgan closed this issue 1 year ago

SDorgan commented 1 year ago

Currently, the Statement Iterator analyzes each statement in a query and returns only the Resolved Statements with their respective dependency trees. I'm looking for a way to obtain both the Resolved Statement and the text of the statement that was executed. For example, if my input query is:

CREATE TABLE t1 AS (SELECT 1 AS column);
CREATE TABLE t2 AS (SELECT 1 AS column);

I would like my output to look like so:

Next line: CREATE TABLE t1 AS (SELECT 1 AS column);
Result:
CreateTableAsSelectStmt
[etc]

Next line: CREATE TABLE t2 AS (SELECT 1 AS column);
Result:
CreateTableAsSelectStmt
[etc]

Is there a way to do this without manually parsing the file and feeding each statement into the ZetaSQLToolkitAnalyzer?

ppaglilla commented 1 year ago

Since version 0.4.0 of the toolkit, performing analysis returns an Iterator of AnalyzedStatement objects. Those currently contain the parsed statement (an ASTStatement object) and, when possible, the ResolvedStatement object.

I'll add the original query to the AnalyzedStatement object in the next release of the toolkit, so it will be readily available.

In the meantime, here's an example of how you can work around not having it. Parsed statements contain the location of the statement in the original query string, so you can use those to substring the original.

String query = "CREATE TEMP TABLE `t` AS (SELECT 1 AS column);\n"
    + "SELECT * FROM `t` WHERE column = 1;";

BigQueryCatalog catalog = ...;
AnalyzerOptions options = ...;
ZetaSQLToolkitAnalyzer analyzer = new ZetaSQLToolkitAnalyzer(options);

Iterator<AnalyzedStatement> statementIterator = analyzer.analyzeStatements(query, catalog);

statementIterator.forEachRemaining(analyzedStatement -> {
  // Use the parse location to get the query segment for this statement
  ASTStatement parsedStmt = analyzedStatement.getParsedStatement();
  ParseLocationRange stmtLocation = parsedStmt.getParseLocationRange();
  String querySegment = query.substring(stmtLocation.start(), stmtLocation.end());

  // The resolved statement is only present when the statement could be resolved
  Optional<ResolvedStatement> maybeResolvedStmt = analyzedStatement.getResolvedStatement();

  System.out.printf("Next statement: %s\n", querySegment);
  System.out.println("Result:");
  maybeResolvedStmt.ifPresent(System.out::println);
});
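
To try the substring step in isolation, without a catalog or analyzer, here is a minimal sketch. `Span` and `segments` are hypothetical names made up for this example; `Span` only stands in for the start and end offsets that `getParseLocationRange()` provides, and the offsets below are computed by hand rather than by the parser. It also assumes the offsets can be used directly as indexes into the Java string, which I would only rely on for ASCII queries.

```java
import java.util.ArrayList;
import java.util.List;

public class StatementSegments {

  /** Hypothetical stand-in for a parse location: [start, end) offsets into the query. */
  static final class Span {
    final int start;
    final int end;

    Span(int start, int end) {
      this.start = start;
      this.end = end;
    }
  }

  /** Returns the trimmed text of each span, clamping offsets so a bad range cannot throw. */
  static List<String> segments(String query, List<Span> spans) {
    List<String> result = new ArrayList<>();
    for (Span span : spans) {
      int start = Math.max(0, Math.min(span.start, query.length()));
      int end = Math.max(start, Math.min(span.end, query.length()));
      result.add(query.substring(start, end).trim());
    }
    return result;
  }

  public static void main(String[] args) {
    String query = "CREATE TABLE t1 AS (SELECT 1 AS column);\n"
        + "CREATE TABLE t2 AS (SELECT 1 AS column);";

    // Hand-computed offsets for this example only; the analyzer would supply these.
    int firstEnd = query.indexOf(';') + 1;
    List<Span> spans = new ArrayList<>();
    spans.add(new Span(0, firstEnd));
    spans.add(new Span(firstEnd, query.length()));

    for (String segment : segments(query, spans)) {
      System.out.println("Next statement: " + segment);
    }
  }
}
```

The clamping and `trim()` are defensive choices for this sketch, not something the toolkit requires; they just keep an out-of-range offset or a leading newline from affecting the printed segment.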

SDorgan commented 1 year ago

Exactly what I was looking for, didn't realize it was added in 0.4.0. Thanks!

ppaglilla commented 1 year ago

You're welcome! Thank you for using the tool!