Open YANG-DB opened 1 month ago
@YANG-DB Should the new took
field only be part of a response in jdbc
format or also somehow part of a response in csv
or raw
format (and if so do we want to add it as a new column)?
Hi I think we can add a new metadata field for this purpose - similar to how it's returning today in ppl OpenSearch response
I see the following approach to implement it:
Add the took info org.opensearch.sql.executor.ExecutionEngine.QueryResponse
The actual took value from the opensearch response is available in org.opensearch.sql.opensearch.response.OpenSearchResponse#OpenSearchResponse(org.opensearch.action.search.SearchResponse, org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory, java.util.List<java.lang.String>)
and we can extract it like
/** Constructor of OpenSearchResponse. */
public OpenSearchResponse(
SearchResponse searchResponse,
OpenSearchExprValueFactory exprValueFactory,
List<String> includes) {
this.hits = searchResponse.getHits();
this.aggregations = searchResponse.getAggregations();
this.exprValueFactory = exprValueFactory;
this.includes = includes;
this.took = searchResponse.getTook().getMillis(); // <<<<------
}
Then further expose it from org.opensearch.sql.opensearch.storage.scan.OpenSearchIndexScan
like
private void fetchNextBatch() {
OpenSearchResponse response = client.search(request);
took = response.getTook(); // <<<<------
if (!response.isEmpty()) {
iterator = response.iterator();
}
}
Then extract it in org.opensearch.sql.opensearch.executor.OpenSearchExecutionEngine.execute()
Long took = null;
if (physicalPlan instanceof ProjectOperator) {
PhysicalPlan input = ((ProjectOperator) physicalPlan).getInput();
if (input instanceof TookAware) {
took = ((TookAware) input).getTook();
}
}
QueryResponse response =
new QueryResponse(
physicalPlan.schema(), result, planSerializer.convertToCursor(plan), took);
listener.onResponse(response);
Then use the took value from QueryResponse
in org.opensearch.sql.protocol.response.format.SimpleJsonResponseFormatter
Two open questions remain so far:
1) took
is only available for opensearch datasource (not for prometheus and not for spark) so there is this nasty instanceof thingy to make the distinction. Is there a better way to do it?
2) Is the general approach sufficient or are there easier ways to implement it?
See https://github.com/eliatra/sql/commit/d02920289cf58222134968a668da2a53254323e4 for a working prototype
If the prototype looks sufficient I am happy to open a PR
@salyh thanks for the research 1) IMO we need to think of a more generic way to add metadata fields to the response - can you suggest something here ? 2) lets wait with the concrete PR until we can generalize the PPL response metadata - IMO you can also check how spark response handles such metadata as an example...
Thanks
Is your feature request related to a problem? As part of the OpenSearch standard query DSL a
took
metadata field which counts the amount of time the query executed is returned in the response:A similar response is needed for PPL queries
What solution would you like? We would like to add this metadata information from the existing DSL response to the original PPL query and combine it with the PPL response as part of the returned metadata
Do you have any additional context?