opensearch-project / OpenSearch-Dashboards

📊 Open source visualization dashboards for OpenSearch.
https://opensearch.org/docs/latest/dashboards/index/
Apache License 2.0

[Design] Data Explorer : Discover 2.0 + Event Explorer MDS cohesion #4991

Closed mengweieric closed 6 months ago

mengweieric commented 1 year ago

Overview

Flint and several other projects are broadening OpenSearch's capabilities so that it can consume not only OpenSearch's internal data but also data from external data sources such as S3, CloudWatch, and Prometheus. Once these external data sources are configured, they will be available for selection from both data explorer and other plugins, including Dashboards Observability. This document discusses the datasource selector, a new user interface that provides a shared datasource selection experience in OpenSearch Dashboards.

Requirements

Functional

Non-functional

UX Design

Prototype Overview - Datasource selector

image(25)

When utilized within the data explorer, this feature will be positioned in the top left corner of the left-hand panel with title ‘Data source’.

Data Explorer with Datasource selector (Observability view)

image(29)

Workflow Overview

The datasource selector and the services behind it are responsible for several tasks when managing datasource selection:

  1. Datasource Registrations: Each plugin, when set up to support its unique data sources, will initialize these data sources and execute the registerDatasource() method to register their specific data sources with core.
  2. Datasource Data Fetching: Each plugin is responsible for fetching its own datasource metadata.
    1. Index Patterns of Local Cluster: uses core APIs to pull all index patterns from the local cluster.
    2. External OpenSearch Cluster: uses core APIs to pull external OpenSearch datasources.
    3. Flint / Prometheus: runs a SQL query to pull Flint and Prometheus datasources.
    4. Other datasources: plugins or core are responsible for fetching the metadata of their own datasources.
  3. Dataset Retrieval: The datasource service iterates over all datasource instances to obtain the datasets to show in the datasource selector. For instance, if the datasource is OpenSearch, the service invokes the data plugin to gather all index patterns as the datasets for that cluster. Conversely, if it is an S3 datasource, it returns the datasource name as the dataset to be rendered in the selector UI.
  4. Context Switch: Upon selection of a datasource, the system transitions to the corresponding datasource context.
    1. Datasources such as Flint and Prometheus are currently supported only by the Observability plugin, so selecting one of them redirects the user to the Observability view. Two things are highlighted during this process:
      1. The view switcher shows the transition from Discover to log explorer.
      2. The entire view, including the left panel, is switched to the Observability log explorer, because the two views have different feature implementations.
    2. No redirection happens when selecting index patterns from local or external OpenSearch clusters, because both the Observability log explorer and Discover support exploring index patterns. Users rely on the view switcher to move between Discover and log explorer when exploring indexes.
  5. Language Switch: Observability requires support for more than one query language, including PPL, PromQL, and natural language. Selecting a different language only changes the view within the Observability view; it does not switch to other views.
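
The dataset retrieval and context switch steps above can be sketched as two small functions. This is only an illustration under stated assumptions: `SourceEntry`, `getDataSets`, `resolveView`, and the view names are hypothetical and are not part of the proposed API.

```typescript
// Hypothetical names throughout; this is not the actual Dashboards API.
type SourceType = 'OPENSEARCH' | 'S3' | 'PROMETHEUS';
type View = 'discover' | 'observability-log-explorer';

interface SourceEntry {
  name: string;
  type: SourceType;
  indexPatterns?: string[]; // only meaningful for OpenSearch clusters
}

// Step 3: OpenSearch sources list their index patterns as datasets;
// other sources fall back to the datasource name itself.
function getDataSets(source: SourceEntry): string[] {
  return source.type === 'OPENSEARCH' ? (source.indexPatterns ?? []) : [source.name];
}

// Step 4: sources supported only by Observability force a redirect;
// index patterns keep the user in whichever view is currently active.
function resolveView(source: SourceEntry, currentView: View): View {
  return source.type === 'OPENSEARCH' ? currentView : 'observability-log-explorer';
}

const local: SourceEntry = {
  name: 'local-cluster',
  type: 'OPENSEARCH',
  indexPatterns: ['logs-*', 'metrics-*'],
};
const s3: SourceEntry = { name: 'my_s3', type: 'S3' };
```

With these definitions, `resolveView(s3, 'discover')` yields the Observability view, while `resolveView(local, 'discover')` stays on Discover.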

Architecture Proposal

To meet the outlined requirements, we will introduce a new datasource service in Dashboards core. Once in place, it extends the current data source plugin's capabilities so that it can integrate not only OpenSearch datasources but also other datasources. Each integrated datasource will operate as an independent instance with its own set of services, settings, components, etc., and will be made available across OpenSearch Dashboards.

Components

OUI Integration

As part of our ongoing efforts to improve and standardize the user interface of OpenSearch, and given the growing number of use cases that need a datasource selector, we propose that the new datasource selector UI (the pure React component mentioned above) eventually become an OUI component. The component will initially be built on the OUI Combo Box in the P0 phase and will later evolve into a component of the OUI framework.

Location of Datasource Codebase

The entire datasource selection system will comprise a set of frontend functionalities located in the public folder of the core data_source plugin. These functionalities will be made accessible via the setup hook in plugin.ts. Any core components or plugins that list data_source as a dependency will have access to this datasource service. During the plugin setup stage, the datasource service initializes and reads the metadata of all available data sources. This metadata is then used to initialize the datasources shown in the selector, along with their corresponding components and APIs.

public setup(core: CoreSetup): DataSourcePluginSetup {
    return {
        datasourceService: [datasource service to be exposed],
        datasourceSelector: [DataSourceSelector UI to be exposed],
    };
}
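
As a rough illustration of how a dependent plugin might consume this setup contract, the sketch below fakes the contract with stand-in types. Everything here is an assumption made for the example: the `...Stub` interfaces, `createDataSourceSetup`, `listNames`, the `dataSource` dependency key, and the string-based `registerDatasource` are hypothetical and are not the real OpenSearch Dashboards types.

```typescript
// Stand-in types: none of these are the real OpenSearch Dashboards types.
interface DataSourceServiceStub {
  registerDatasource(name: string): void;
  listNames(): string[];
}

interface DataSourcePluginSetupStub {
  datasourceService: DataSourceServiceStub;
}

// Minimal fake of what the data_source plugin's setup() might return.
function createDataSourceSetup(): DataSourcePluginSetupStub {
  const names: string[] = [];
  return {
    datasourceService: {
      registerDatasource: (name) => {
        names.push(name);
      },
      listNames: () => names.slice(),
    },
  };
}

// A consuming plugin receives the contract through its declared dependencies.
// The dependency key `dataSource` is an assumption made for this sketch.
class ObservabilityPluginSketch {
  setup(_core: unknown, deps: { dataSource: DataSourcePluginSetupStub }) {
    deps.dataSource.datasourceService.registerDatasource('my_prometheus');
  }
}

const dataSourceSetup = createDataSourceSetup();
new ObservabilityPluginSketch().setup({}, { dataSource: dataSourceSetup });
```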

Interfaces and Classes

Datasource

Every datasource instance is fundamentally an abstraction of a specific datasource connection. Depending on its type, a datasource may have one or several instances. Because the datasource selector UI also displays the datasets associated with a datasource connection (for example, index patterns for an OpenSearch datasource, or tables for a Flint datasource), each datasource's implementation of getDataSet must account for its own way of fetching those datasets.

/**
 * Abstract class representing a data source. This class provides foundational
 * interfaces for specific data sources. Any data source connection needs to
 * extend this base class and implement its abstract methods.
 *
 * DataSourceMetaData: Represents metadata associated with the data source.
 * SourceDataSet: Represents the dataset associated with the data source.
 * DataSourceQueryResult: Represents the result from querying the data source.
 * DataSetParams: Represents the parameters accepted by getDataSet.
 * DataSourceQueryParams: Represents the parameters accepted by runQuery.
 */
export abstract class DataSource<
  DataSourceMetaData,
  SourceDataSet,
  DataSourceQueryResult,
  DataSetParams,
  DataSourceQueryParams
> {
  constructor(
    private readonly name: string,
    private readonly type: string,
    private readonly metadata: DataSourceMetaData
  ) {}

  getName() {
    return this.name;
  }

  getType() {
    return this.type;
  }

  getMetadata(): DataSourceMetaData {
    return this.metadata;
  }

  /**
   * Abstract method to get the dataset associated with the data source.
   * Implementing classes need to provide the specific implementation.
   *
   * Data source selector needs to display data sources with pattern
   * group (connection name) - a list of datasets. For example, get
   * all available tables for flint datasources, and get all index
   * patterns for OpenSearch data source
   *
   * @returns {SourceDataSet} Dataset associated with the data source.
   */
  abstract getDataSet(dataSetParams?: DataSetParams): SourceDataSet;

  /**
   * Abstract method to run a query against the data source.
   * Implementing classes need to provide the specific implementation.
   *
   * @returns {DataSourceQueryResult} Result from querying the data source.
   */
  abstract runQuery(queryParams: DataSourceQueryParams): DataSourceQueryResult;

  /**
   * Abstract method to test the connection to the data source.
   * Implementing classes should provide the specific logic to determine
   * the connection status, typically indicating success or failure.
   *
   * @returns {ConnectionStatus} Status of the connection test.
   */
  abstract testConnection(): ConnectionStatus;
}

interface ConnectionStatus {
  success: boolean;
  info: string;
}
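
To make the contract concrete, here is a minimal sketch of a subclass. It restates a trimmed copy of the abstract class with shortened type parameter names so that it compiles on its own. `InMemoryDataSource`, `DemoMeta`, and the sample tables are hypothetical and exist only for illustration; a real datasource would call its backend instead of reading from metadata.

```typescript
interface ConnectionStatus {
  success: boolean;
  info: string;
}

// Trimmed restatement of the abstract class above, so the sketch is self-contained.
abstract class DataSource<TMeta, TDataSet, TResult, TDataSetParams, TQueryParams> {
  constructor(
    private readonly name: string,
    private readonly type: string,
    private readonly metadata: TMeta
  ) {}
  getName() { return this.name; }
  getType() { return this.type; }
  getMetadata(): TMeta { return this.metadata; }
  abstract getDataSet(params?: TDataSetParams): TDataSet;
  abstract runQuery(params: TQueryParams): TResult;
  abstract testConnection(): ConnectionStatus;
}

// Hypothetical in-memory source: its "tables" live in the metadata.
interface DemoMeta {
  tables: Record<string, number[]>;
}

class InMemoryDataSource extends DataSource<
  DemoMeta,
  string[],
  number[],
  { prefix: string },
  { table: string }
> {
  // Datasets are the table names, optionally narrowed by a prefix.
  getDataSet(params?: { prefix: string }): string[] {
    const names = Object.keys(this.getMetadata().tables);
    return params ? names.filter((n) => n.indexOf(params.prefix) === 0) : names;
  }

  // A "query" simply returns the rows stored for the requested table.
  runQuery(params: { table: string }): number[] {
    return this.getMetadata().tables[params.table] ?? [];
  }

  testConnection(): ConnectionStatus {
    return { success: true, info: 'in-memory source is always reachable' };
  }
}

const demo = new InMemoryDataSource('demo', 'IN_MEMORY', {
  tables: { http_logs: [200, 404], vpc_flow: [1] },
});
```

The selector would call `demo.getDataSet()` to list the entries under the `demo` group, and a query editor would call `demo.runQuery({ table: 'http_logs' })`.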

Datasource_plugin-Page-3 drawio (3) (1)

Datasource Service

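import { forEach, isEmpty } from 'lodash';

// Type arguments follow the order declared on the DataSource class.
type DataSourceType = DataSource<
  IDataSourceMetaData,
  ISourceDataSet,
  IDataSourceQueryResult,
  IDataSetParams,
  IDataSourceQueryParams
>;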

interface IDataSourceService {
  registerDatasource: (ds: DataSourceType) => Promise<IDataSourceRegisterationResult>;
  getDataSources: (filters?: IDataSourceFilters) => Record<string, DataSourceType>;
}

class DataSourceService implements IDataSourceService {
  // A record to store all registered data sources, using the data source name as the key.
  private dataSources: Record<string, DataSourceType> = {};

  constructor() {}

  /**
   * Register multiple data sources at once.
   *
   * @param datasources - An array of data sources to be registered.
   * @returns An array of promises, one registration result per data source.
   */
  registerMultipleDataSources(datasources: DataSourceType[]) {
    // Use an arrow function so `this` stays bound inside registerDatasource.
    return datasources.map((ds) => this.registerDatasource(ds));
  }

  /**
   * Register a single data source.
   * Throws an error if a data source with the same name is already registered.
   *
   * @param ds - The data source to be registered.
   * @returns A registration result indicating success or failure.
   * @throws {DataSourceRegisterationError} Throws an error if a data source with the same name already exists.
   */
  async registerDatasource(ds: DataSourceType): Promise<IDataSourceRegisterationResult> {
    const dsName = ds.getName();
    if (dsName in this.dataSources) {
      throw new DataSourceRegisterationError(
        `Unable to register datasource ${dsName}, error: datasource name exists.`
      );
    } else {
      this.dataSources = {
        ...this.dataSources,
        [dsName]: ds,
      };
      return { success: true, info: '' } as IDataSourceRegisterationResult;
    }
  }

  /**
   * Retrieve the registered data sources based on provided filters.
   * If no filters are provided, all registered data sources are returned.
   *
   * @param filters - An optional object with filter criteria (e.g., names of data sources).
   * @returns A record of filtered data sources.
   */
  getDataSources(filters?: IDataSourceFilters): Record<string, DataSourceType> {
    if (!filters || isEmpty(filters.names)) return this.dataSources;
    const filteredDataSources: Record<string, DataSourceType> = {};
    forEach(filters.names, (dsName) => {
      if (dsName in this.dataSources) {
        filteredDataSources[dsName] = this.dataSources[dsName];
      }
    });
    return filteredDataSources;
  }
}
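
The register-then-filter behaviour can be exercised with a small stand-in. This is a simplified sketch, not the service itself: it is synchronous (the proposed registerDatasource returns a Promise), it stores only objects exposing getName() and getType(), and `RegistrySketch`, `NamedSource`, and `make` are hypothetical names.

```typescript
// Simplified, synchronous stand-in for the service above. Assumption: the
// proposed service is async and stores full DataSource instances.
interface NamedSource {
  getName(): string;
  getType(): string;
}

class RegistrySketch {
  private sources: Record<string, NamedSource> = {};

  register(ds: NamedSource): { success: boolean; info: string } {
    const name = ds.getName();
    if (name in this.sources) {
      throw new Error(`Unable to register datasource ${name}, error: datasource name exists.`);
    }
    this.sources[name] = ds;
    return { success: true, info: '' };
  }

  // An arrow function keeps `this` bound when mapping over the inputs.
  registerMany(list: NamedSource[]) {
    return list.map((ds) => this.register(ds));
  }

  // With no names given, return everything; otherwise only the known names.
  get(names?: string[]): Record<string, NamedSource> {
    if (!names || names.length === 0) return this.sources;
    const out: Record<string, NamedSource> = {};
    for (const n of names) {
      if (n in this.sources) out[n] = this.sources[n];
    }
    return out;
  }
}

const make = (name: string, type: string): NamedSource => ({
  getName: () => name,
  getType: () => type,
});

const registry = new RegistrySketch();
registry.registerMany([make('local', 'OPENSEARCH'), make('my_s3', 'S3')]);

// Unknown names are skipped rather than reported as errors.
const onlyS3 = registry.get(['my_s3', 'missing']);

// Registering a second source under an existing name is rejected.
let duplicateRejected = false;
try {
  registry.register(make('local', 'OPENSEARCH'));
} catch (e) {
  duplicateRejected = true;
}
```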

Datasource Pluggable

Each datasource type is accompanied by one pluggable module. This module contains a collection of UI components, along with any other custom components required to render when the corresponding datasource is selected.

Datasource_plugin-Page-4 drawio (1)

interface IDataSourcePluggableComponents {
   QueryEditor: React.ReactNode;
   ConfigEditor: React.ReactNode;
   SidePanel: React.ReactNode;
}

// For now only contains custom UI components
class DataSourcePluggable {
   // Start empty so the setters below can populate components one at a time.
   private components: Partial<IDataSourcePluggableComponents> = {};

   public setQueryEditor(queryEditor: React.ReactNode) {
        this.components.QueryEditor = queryEditor;
        return this;
   }

   public setConfigEditor(configEditor: React.ReactNode) {
        this.components.ConfigEditor = configEditor;
        return this;
   }

   public setSidePanel(sidePanel: React.ReactNode) {
        this.components.SidePanel = sidePanel;
        return this;
   }

   ...
}

Each datasource type is associated with a unique pluggable module. For instance, if we have three types of datasources (a default local cluster, an external OpenSearch cluster, and a Flint datasource), then when the data_source plugin is loaded it will expose three distinct pluggable modules, each corresponding to one of these datasource types. Data explorer, or any other framework or platform, uses the datasource type to look up the corresponding pluggable module.

// example
const datasourcePlugins = {
   OPENSEARCH: [OpenSearchExternalDataSourcePluggable],
   FLINT: [FlintDataSourcePluggable],
   PROMETHEUS: [PrometheusDataSourcePluggable]
};
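
The lookup by datasource type might work roughly as follows. This sketch uses plain strings in place of React nodes so that it runs without React. `PluggableSketch`, `getComponents`, `queryEditorFor`, and the component names are hypothetical and are not taken from the proposal.

```typescript
// Components are plain strings here to keep the sketch free of React;
// the design above uses React.ReactNode.
type ComponentStub = string;

interface PluggableComponents {
  QueryEditor?: ComponentStub;
  ConfigEditor?: ComponentStub;
  SidePanel?: ComponentStub;
}

class PluggableSketch {
  private components: PluggableComponents = {};

  // Each setter returns `this` so calls can be chained.
  setQueryEditor(c: ComponentStub) {
    this.components.QueryEditor = c;
    return this;
  }

  setSidePanel(c: ComponentStub) {
    this.components.SidePanel = c;
    return this;
  }

  // Hypothetical accessor; the proposal elides the rest of the class.
  getComponents(): PluggableComponents {
    return this.components;
  }
}

// Hypothetical registry keyed by datasource type.
const pluggables: Record<string, PluggableSketch> = {
  OPENSEARCH: new PluggableSketch().setQueryEditor('DQLEditor'),
  PROMETHEUS: new PluggableSketch().setQueryEditor('PromQLEditor').setSidePanel('MetricsPanel'),
};

// A consumer such as data explorer resolves UI by the selected source's type.
function queryEditorFor(type: string): ComponentStub | undefined {
  const pluggable = pluggables[type];
  return pluggable ? pluggable.getComponents().QueryEditor : undefined;
}
```

Keying by type rather than by connection name means two Prometheus connections share one set of components.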

Datasource Integration

Onboarding Datasource

Every datasource connection in OpenSearch Dashboards, whether initialized at startup (as with default datasources) or at runtime (when creating a new connection), needs to go through steps 1 and 3 below to be integrated into the datasource selector.

1. Datasource creation

Owners of datasources are responsible for constructing their datasource classes. They should base their decision on the level of shared functionalities with existing classes in the inheritance chain of the datasource.

2. Datasource plugin creation (optional, out of scope in p0 for core, Observability only)

In phase 0, the datasource pluggable will reside within the Observability plugin temporarily. Subsequent decisions will determine whether it should be promoted to the core datasource plugin functionalities.

Owners of datasources are also responsible for constructing their datasource pluggable classes for a new datasource type, or for adding their custom components to the existing pluggable of that datasource type if they have custom components to render. Since data explorer already has view registration, creating a pluggable for the OpenSearch datasource type is not required for core in P0, but Observability will rely on this pluggable to dynamically render UI components internally.

3. Datasource registration

Owners of datasources are also responsible for registering their datasources with the core datasource service by calling the registerDatasource API that the datasource service exposes.

Using selector UI independently

Consumers who only require the datasource selection UI, and not the entire service, have the option to use just that part. The datasource selector component is essentially a wrapper around the OUI combo box. Therefore, instead of implementing the full datasource, they can integrate only the datasource selector UI into their applications.
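
For a consumer using only the selector, the main task is shaping its data into grouped options. The sketch below assumes a grouped-option shape of `{ label, options }`, modeled on combo box conventions; treat the exact prop names as an assumption. `toSelectorGroups`, `SelectorGroup`, and `SelectorOption` are hypothetical.

```typescript
// Assumed option shape: one group per datasource connection, one option per
// dataset. The prop names `label` and `options` are an assumption.
interface SelectorOption {
  label: string;
}

interface SelectorGroup {
  label: string;
  options: SelectorOption[];
}

// Turns a map of connection name -> dataset names into grouped options.
function toSelectorGroups(dataSetsBySource: Record<string, string[]>): SelectorGroup[] {
  return Object.keys(dataSetsBySource).map((source) => ({
    label: source,
    options: dataSetsBySource[source].map((dataSet) => ({ label: dataSet })),
  }));
}

const groups = toSelectorGroups({
  'local-cluster': ['logs-*', 'metrics-*'],
  my_s3: ['my_s3'],
});
```

Here the S3 connection appears as a group whose only option is its own name, matching the dataset retrieval rule described in the workflow.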

ashwin-pc commented 1 year ago

I like the overall goal here of having a unified datasource selector that can be used across apps. However, I would keep application-specific details out of such a selector, since they limit where and how it can be used. For example, I like the datasource registration and dataset retrieval parts of the design, and I also like how we can expose a common datasource selector UI component. What I'd keep out of this, though, is the context and language switching, since those are usually application specific. Let the underlying application handle that. Based on the datasource selected, the underlying application may choose to do different things depending on the user's context, something the datasource service may not be aware of.