Closed kevinwcyu closed 1 year ago
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 3 committers have signed the CLA.
:white_check_mark: kevinwcyu
:x: iwysiu
:x: andresmgot
You have signed the CLA already but the status is still pending? Let us recheck it.
I'd like to provide some high level feedback here on the overall architecture before reviewing the code. So this comment is not only for changes made in this PR, but also the draft PR the grafana-aws-sdk, redshift-datasource and in grafana-aws-sdk-react. Hope that's ok!
First of all, I've tested the feature locally and it seems to work fluently. It solves the problem it aimed to solve, and the user experience is great. So really amazing job here!
Regarding the design, there's a few things that I'm a bit concerned about:
sqlds
which is supposed to be generic and provide a standardized integration with go database/sql and driver (provided by the plugin)args
in the go database/sql QueryContext method are supposed to be used for placeholders so that the query can be interpolated. Using it for passing queryId
is not the end of the world, but it is violating the api's original intentAn alternative design would be to create a new redshiftAsyncDriver
that is implementing the go database/sql/driver interface. This would handle all async logic. When a new query is executed with QueryContext
, the async driver would
1) getQueryId
2) if no existing query
start new query, return one column (status) with one row ('ongoing')
2) if existing query
get query status
if completed, return the result. If ongoing, return one column (status) with one row ('ongoing')
It would be up to the request looper in the frontend to check the status column and decide whether to make a new backend request.
It's a bit unconventional, and it would be one extra API call to check query id per backend request, but...
DataSourceWithAsyncBackend
file that we keep in the grafana-aws-sdk-react. Thoughts @kevinwcyu @iwysiu?
Thanks, @sunker
- getQueryId
- if no existing query start new query, return one column (status) with one row ('ongoing')
- if existing query
get query status
if completed, return the result. If ongoing, return one column (status) with one row ('ongoing')It would be up to the request looper in the frontend to check the status column and decide whether to make a new backend request.
Would it not be possible to set the meta field instead of returning the metadata as data? If we go this route, I'd expect us to have to choose a very unique name for the column to prevent clashing with a genuine user query.
the generic sqlds code could remain unchanged, using only one execution flow
Is there a place we could share the asynchronous flow handling? Perhaps a different package within sqlds? If I'm understanding the above correctly, each datasource would be reimplementing the flow itself, it would be nice if a datasource could opt-in to using some shared handling for async flows.
I'm still digesting some of the other suggestions, will wait for @iwysiu's thoughts as well.
Would it not be possible to set the meta field instead of returning the metadata as data? If we go this route, I'd expect us to have to choose a very unique name for the column to prevent clashing with a genuine user query.
The notion of data frames doesn't exist in go database/sql, so only rows can be returned. Meta fields could be set in sqlds, but if possible I'd like to avoid these kind of changes there.
Is there a place we could share the asynchronous flow handling? Perhaps a different package within sqlds? If I'm understanding the above correctly, each datasource would be reimplementing the flow itself, it would be nice if a datasource could opt-in to using some shared handling for async flows.
Since sqlds is built around go database/sql, and database/sql is not async by nature, we should avoid adding such code to this package. Also, the async execution flow will only be of interest for redshift and athena, not any other consumer of this package. For that reason, I think I'd like to see is the async vs sync flow routing taking placing somewhere on a higher level than sqlds.
Another approach would be to create a wrapper around sqlds that would provide this routing. This wrapper could for example be placed in grafana-aws-sdk. Some basic pseudo code here:
// creating a new ds in main.go in redshift
s := redshift.New()
ds := awsds.HigherLevelDatasource(s)
ds.Completable = s
ds.CustomRoutes = routes.New(s).Routes()
...
// in grafana-aws-sdk
type HigherLevelDatasource struct {
*sqlds.SQLdatasource
sqldsQueryDataHander backend.QueryDataHandlerFunc
}
func NewHigherLevelDatasource(driver sqlds.Driver) *HigherLevelDatasource {
sqlds := sqlds.NewDatasource(driver)
return &HigherLevelDatasource{
SQLdatasource: sqlds,
sqldsQueryDataHander: sqlds.QueryData,
}
}
func (ds *HigherLevelDatasource) QueryData(ctx context.Context, req *backend.QueryDataRequest) (*backend.QueryDataResponse, error) {
if syncExectionEnabled {
return ds.sqldsQueryDataHander.QueryData(ctx, req)
} else {
// async flow
}
}
Code not tested!
My main point here is that I'd like to keep sqlds untouched if possible.
My two cents here, I agree with the point of making a clear differentiation between the existing sync code and the new async. Creating a wrapper around the current package makes sense, since in theory, the async flow is just an extension of the sync one since datasources should support both.
grafana-aws-sdk
is maybe not the best place for it though. We probably want to use this flow for non AWS data sources (like bigquery) and placing the code there may make some assumptions about AWS. Is making a sub go package here isolation enough? IMO it is but if not we may need to create a new repo for this (which I would avoid if there is not major reason for it).
grafana-aws-sdk
is maybe not the best place for it though. We probably want to use this flow for non AWS data sources (like bigquery) and placing the code there may make some assumptions about AWS. Is making a sub go package here isolation enough? IMO it is but if not we may need to create a new repo for this (which I would avoid if there is not major reason for it).
I agree - end goal here would be to have it in its own package, or maybe even in the grafana-plugin-sdk-go. But to get a quick start, you can use the grafana-aws-sdk and move it elsewhere when the api is complete.
closing in favor of https://github.com/grafana/sqlds/pull/70 , which doesn't have the async code in sqlds.
All committers have signed the CLA.