opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
121 stars 140 forks

[FEATURE] Unify the async query request metadata model #3140

Open ykmr1224 opened 3 weeks ago

ykmr1224 commented 3 weeks ago

(moved from opensearch-spark#602 since this issue is for the SQL repository)

Is your feature request related to a problem? Currently in async-query-core, three models store the async query request and status: AsyncQueryJobMetadata, StatementModel, and IndexDMLResult. They are all stored in the same request index (.query_request_index_{datasourceName}) under different documentIds (qid{requestId}, {requestId}, index{requestId}). These three models overlap each other and are redundant, which also causes confusion.

What solution would you like? Unify them into a single model, and store a single record in the request index.

I propose the following unified model:

AsyncQueryMetadata

How to make it backward compatible?

In the new version, we will use a single interface to query and save the query metadata. The documentId will be the queryId (without prefix). Since the current StatementModel uses queryId as its key, the StatementModel use case would be fine: whether a lookup hits an older record or a newer record, the record contains the fields required for the statement use case. For the AsyncQueryMetadata use case (which arises only when the GetAsyncQueryResults API is called), we can first query with the queryId (without prefix) and, if the version field is not "2.0", fall back to querying with the "qid" prefix, which would retrieve the older AsyncQueryJobMetadata record. IndexDMLResult does not have a read use case, so it raises no backward-compatibility concern.
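
The fallback lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual async-query-core code: the class name `AsyncQueryMetadataLookup`, the in-memory `Map` standing in for the request index, and the `source` field are all hypothetical. Only the `version` field, the "2.0" value, and the "qid" prefix come from the proposal.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: a Map stands in for the request index
// (.query_request_index_{datasourceName}); keys are documentIds.
class AsyncQueryMetadataLookup {
    static final String UNIFIED_VERSION = "2.0"; // version value from the proposal
    static final String LEGACY_PREFIX = "qid";   // prefix used by older AsyncQueryJobMetadata records

    private final Map<String, Map<String, Object>> requestIndex;

    AsyncQueryMetadataLookup(Map<String, Map<String, Object>> requestIndex) {
        this.requestIndex = requestIndex;
    }

    // Lookup for the GetAsyncQueryResults use case.
    Optional<Map<String, Object>> find(String queryId) {
        // 1. Query with the bare queryId (the new documentId scheme).
        Map<String, Object> doc = requestIndex.get(queryId);
        if (doc != null && UNIFIED_VERSION.equals(doc.get("version"))) {
            return Optional.of(doc);
        }
        // 2. Missing, or an older record without version "2.0" (for example
        //    an old StatementModel record): fall back to the "qid" prefix.
        return Optional.ofNullable(requestIndex.get(LEGACY_PREFIX + queryId));
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> index = new HashMap<>();
        // New-style unified record, keyed by the bare queryId.
        index.put("q1", Map.of("version", "2.0", "source", "unified"));
        // Old-style pair: a statement record and a "qid"-prefixed job metadata record.
        index.put("q2", Map.of("source", "statement"));
        index.put("qidq2", Map.of("source", "legacyJobMetadata"));

        AsyncQueryMetadataLookup lookup = new AsyncQueryMetadataLookup(index);
        System.out.println(lookup.find("q1"));
        System.out.println(lookup.find("q2"));
        System.out.println(lookup.find("q3"));
    }
}
```

With this shape, the statement use case would keep reading by the bare queryId unchanged, and only the GetAsyncQueryResults path pays for a second lookup when it meets an older record.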

andrross commented 3 days ago

[Catch All Triage - 1, 2, 3, 4, 5]