Open BionIT opened 5 months ago
Hi @mengweieric, thanks for sharing your knowledge about the data source selector. Please let me know if you have any concerns or questions regarding this request.
A few comments on the requirements:
These are honestly questions for @shanilpa and @kgcreative unless these are technical requirements :)
There's been some ambiguity in terms. In this case, this refers to a data source connection, but we've been using data source as a shorthand. Let's align terminology with @dagney here, as this can quickly lead to awkward nomenclature.
For now, Select cluster seems to fix the term confusion.
@kgcreative @dagneyb @BionIT @ashwin-pc @mengweieric
A Connection could define the parameters required to establish a connection. As we move to a multi-datasource world, there could be more data source types we want to support, each with a different type of connection.
For example, OpenSearch/Elasticsearch is a cluster-based RESTful service. In opensearch_dashboards.yml, the following parameters define which OpenSearch cluster OSD connects to:
opensearch.hosts: ["http://localhost:9200"]
opensearch.username: "opensearch_dashboards_system"
opensearch.password: "pass"
Let's also look at a non-OpenSearch data source type and its connection. For example, a MySQL connection needs host, auth, and protocol in order to connect:
mysql --host=localhost --user=myname --password mydb --protocol={TCP|SOCKET|PIPE|MEMORY}
Now we can see the common part: each connection needs to define at least host, auth, and protocol in order to connect.
However, we may ask: does DataSource equal Connection? In our current setup, maybe yes. But let's look at a more comprehensive case. Suppose a customer set up their own OpenSearch Dashboards instance with the config below and multiple datasource disabled, meaning they use their own index name for the OpenSearch Dashboards index. When the customer enables multiple datasource and migrates to that setup, connection information for OSD to reach OpenSearch is not enough. We should also let the customer configure their default index name, so OSD knows which user-specific index stores the Dashboards saved object metadata and can locate the right index to load data for migration.
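To make the "common part" concrete, here is a minimal TypeScript sketch of a connection shape covering host, auth, and protocol. All names here (`Connection`, `Auth`, `Protocol`, `fromOpenSearchHost`) are hypothetical and only for illustration; they are not existing OpenSearch Dashboards types.

```typescript
// Hypothetical types for illustration; not part of OpenSearch Dashboards.
type Protocol = "http" | "https" | "tcp" | "socket" | "pipe" | "memory";

interface Auth {
  username: string;
  password?: string;
}

interface Connection {
  host: string;
  protocol: Protocol;
  auth: Auth;
}

// Derive a Connection from one entry of opensearch.hosts plus credentials.
function fromOpenSearchHost(hostUrl: string, auth: Auth): Connection {
  const sep = hostUrl.indexOf("://");
  if (sep < 0) {
    throw new Error(`Missing scheme in host URL: ${hostUrl}`);
  }
  const scheme = hostUrl.slice(0, sep);
  if (scheme !== "http" && scheme !== "https") {
    throw new Error(`Unsupported scheme: ${scheme}`);
  }
  const protocol: Protocol = scheme === "https" ? "https" : "http";
  return { host: hostUrl.slice(sep + 3), protocol, auth };
}

// The OpenSearch example from the yml settings above.
const openSearchConnection = fromOpenSearchHost("http://localhost:9200", {
  username: "opensearch_dashboards_system",
  password: "pass",
});

// The MySQL example fits the same shape (password prompted at connect time).
const mysqlConnection: Connection = {
  host: "localhost",
  protocol: "tcp",
  auth: { username: "myname" },
};
```

Both examples fit one interface, which is the point of the comparison: the data source type changes, but the minimum connection parameters do not.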
opensearchDashboards.index: ".my_opensearch_dashboards"
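A rough sketch of why DataSource is more than Connection in this migration case: the data source carries the connection plus the saved-object index name. The types and the `dataSourceFromLegacySettings` helper are hypothetical, and the fallback index is passed in by the caller rather than assumed.

```typescript
// Hypothetical model for illustration; not the actual OSD saved-object schema.
interface ConnectionInfo {
  host: string;
  username: string;
  password: string;
}

interface DataSource {
  name: string;
  connection: ConnectionInfo;
  // Index that stores Dashboards saved-object metadata for this data source.
  dashboardsIndex: string;
}

// Flat keys as they appear in opensearch_dashboards.yml.
interface LegacySettings {
  "opensearch.hosts": string[];
  "opensearch.username": string;
  "opensearch.password": string;
  "opensearchDashboards.index"?: string;
}

// Build a DataSource from a single-cluster config so a migration knows
// both where to connect and which index to read saved objects from.
function dataSourceFromLegacySettings(
  name: string,
  settings: LegacySettings,
  fallbackIndex: string
): DataSource {
  const host = settings["opensearch.hosts"][0];
  if (host === undefined) {
    throw new Error("opensearch.hosts must contain at least one host");
  }
  return {
    name,
    connection: {
      host,
      username: settings["opensearch.username"],
      password: settings["opensearch.password"],
    },
    dashboardsIndex: settings["opensearchDashboards.index"] ?? fallbackIndex,
  };
}

const migrated = dataSourceFromLegacySettings(
  "customer-cluster",
  {
    "opensearch.hosts": ["http://localhost:9200"],
    "opensearch.username": "opensearch_dashboards_system",
    "opensearch.password": "pass",
    "opensearchDashboards.index": ".my_opensearch_dashboards",
  },
  ".fallback_index" // placeholder value chosen by the caller
);
```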
Summarize the information
Additional callout: an index pattern or a future DataView is not a datasource. Index Pattern, DataView, and DataTable are logical/physical collections of data consumed for a specific usage. Datasources map to a Server, Cluster, or Database, which organizes Index Patterns, DataViews, and DataTables to address broader business needs.
It is great to see that we are starting to think about reusable components, e.g. https://github.com/opensearch-project/OpenSearch-Dashboards/pull/5167. However, we need to be sure to design reusability for each specific scenario and scope.
Following the single responsibility principle, I suggest designing two separate picker components to avoid over-engineering: 1) DataSource picker, and 2) DataSet picker (for DataSet, DataView, DataTable, Index Pattern).
In the migration scenario for the multiple datasource feature, we will need the first one.
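As a sketch of how the two pickers could stay separate while sharing only an option shape, assuming hypothetical names throughout (`PickerOption`, `DataSourceRef`, `DataSetRef`, and both helper functions are made up for this comment):

```typescript
// Hypothetical shapes for illustration; not existing OSD interfaces.
interface PickerOption<T> {
  id: string;
  label: string;
  value: T;
}

// A data source maps to a server, cluster, or database.
interface DataSourceRef {
  id: string;
  title: string;
}

// A data set is a collection of data that lives inside one data source.
interface DataSetRef {
  id: string;
  title: string;
  kind: "index-pattern" | "data-view" | "data-table";
  dataSourceId: string;
}

// Picker 1: only knows about data sources.
function dataSourceOptions(sources: DataSourceRef[]): PickerOption<DataSourceRef>[] {
  return sources.map((s) => ({ id: s.id, label: s.title, value: s }));
}

// Picker 2: only knows about data sets, scoped to one chosen data source.
function dataSetOptions(
  sets: DataSetRef[],
  dataSourceId: string
): PickerOption<DataSetRef>[] {
  return sets
    .filter((s) => s.dataSourceId === dataSourceId)
    .map((s) => ({ id: s.id, label: s.title, value: s }));
}
```

The migration scenario would only call `dataSourceOptions`, so it never depends on index patterns or data views.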
@seraphjiang I'm with you here. After speaking to @bandinib-amzn I too agree that we need 2 different pickers for the two separate use cases. If we can disambiguate the two names, that is ideal. Datasource is overloaded at the moment and we need alignment on it. I am not stuck on any particular name, and your suggestion of "1) DataSource picker, and 2) DataSet picker (for DataSet, DataView, DataTable, Index Pattern)" works for me. @kgcreative @dagneyb what do you think?
So there are four layers that we are conflating here, and I think as we make more data available, this will continue to add confusion.
I think we really need to nail down the terminology and different layers here, or things are going to get really confusing really fast.
Proposing names for these to clarify going forward:
Thoughts? @brijos @anirudha @ashwin-pc @seraphjiang @BionIT @kamingleung
I agree with "Data connection" as a high level concept, but I think "Data set" needs some deeper thought
Agree that "Data connection" makes sense in sample data and devtools when selecting the cluster. Since we are trying to make the names less confusing, I think we should review the existing terms used in Dashboards and make sure they are clear in the next release. Right now, we have data source in management and also as a plugin, and the pickers refer to different data source concepts.
@dagneyb I'm aligned with the names.
Is your feature request related to a problem? Please describe.
In https://github.com/opensearch-project/OpenSearch-Dashboards/issues/5717, we found that the devtools and tutorial "add sample data" pages both have duplicated code for a data source picker. In https://github.com/opensearch-project/OpenSearch-Dashboards/issues/5712, we want to use a data source picker as well. To avoid code duplication, we proposed extracting the duplicated logic into one component to satisfy these use cases. After finding that there is an experimental data selector component implemented in https://github.com/opensearch-project/OpenSearch-Dashboards/pull/5167, we suggest making it adaptable to our use cases in multiple data source.
Describe the solution you'd like
To give full context about the picker finalized with UX in https://github.com/opensearch-project/OpenSearch-Dashboards/issues/5712, this is how we want the data source picker to behave. Please note that in multiple data source, a data source means a connection to a cluster:
Data connection
Local cluster chosen
Select a data connection in the input
Describe alternatives you've considered
Have a separate data source picker for multiple data source
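For reference, one reading of the picker behaviour requested above, sketched as plain TypeScript state rather than a real component. The names (`ConnectionOption`, `PickerState`, `buildPickerState`) and the `includeLocalCluster` flag are assumptions for illustration; only the three strings come from the request.

```typescript
// Hypothetical state model; the real picker component may differ.
interface ConnectionOption {
  id: string;
  label: string;
}

interface PickerState {
  fieldLabel: string;
  placeholder: string;
  options: ConnectionOption[];
  selected?: ConnectionOption;
}

// Assumption: the local cluster is represented by an empty id.
const LOCAL_CLUSTER: ConnectionOption = { id: "", label: "Local cluster" };

// Build the initial state: when the local cluster is offered it is chosen
// by default; otherwise nothing is selected and the placeholder shows.
function buildPickerState(
  connections: ConnectionOption[],
  includeLocalCluster: boolean
): PickerState {
  const options = includeLocalCluster
    ? [LOCAL_CLUSTER, ...connections]
    : [...connections];
  return {
    fieldLabel: "Data connection",
    placeholder: "Select a data connection",
    options,
    selected: includeLocalCluster ? LOCAL_CLUSTER : undefined,
  };
}
```

Devtools and the sample data page could both consume the same state builder, which is the duplication this request is trying to remove.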
Additional context
Add any other context or screenshots about the feature request here.