ananzh commented 4 months ago

Background

The primary problem we are addressing is the need for more advanced and customizable data visualization capabilities in OpenSearch Dashboards. While VisBuilder reached General Availability (GA) in version 2.15, it is currently limited to a few chart types and lacks the comprehensive set of controls necessary for complex visualizations. Enhancing VisBuilder to incorporate more complex controls will provide users with powerful tools for data analysis and reporting, thereby improving the overall user experience and functionality of OpenSearch Dashboards. Additionally, from a technical perspective, we aim to streamline the visualization process by consolidating the multiple existing libraries (such as timeline, vislib, and vega) into a single, cohesive library. This unification will simplify the development and maintenance of visualizations, ensuring consistency and ease of use for developers and users alike.

Requirements and Considerations

Requirements

Technical Requirements:

Integrate Vega and Vega-Lite visualizations into OpenSearch Dashboards, supporting a wide range of aggregations, like metric and bucket aggregations as defined in METRIC_TYPES and BUCKET_TYPES in AggConfigs for OpenSearch data.
Allow users to create and customize visualizations using the Vega editor.
Ensure seamless migration from existing visualization tools to Vega in VisBuilder.

Non-Technical Requirements:

Enhance user experience by providing a more flexible and powerful visualization tool through adding more controls.
Ensure that the new tool is intuitive and easy to use for users transitioning from older tools.

Considerations and Optimizations

Optimizations:

Flexibility and Customization: Optimize for the ability to create highly customized visualizations.
User Experience: Ensure that the tool is easy to use and integrates smoothly with existing workflows.
Extensibility: Design the solution to be easily extensible for future enhancements and new features.
Reuse Existing Components: Leverage existing components like expression, embeddable, and VegaVisEditor to lower integration risks and ensure compatibility with other parts of the system.
Performance: Maintain the performance of the current VisBuilder when using Vega visualizations.

Non-Prioritized Aspects:

Latency: While performance is important, we do not prioritize ultra-low latency over customization and flexibility.
Redundancy: Focus is on functionality and user experience rather than on high redundancy.

Out of Scope

Backend Data Processing Enhancements: This design does not cover improvements or changes to the backend data processing capabilities. The focus is strictly on the visualization layer.
New Data Sources Integration: Integrating new data sources is currently outside the scope; the design assumes existing data sources are sufficient.
Non-Vega Visualizations: Enhancements or changes to non-Vega visualization tools are not covered, as the focus is on integrating Vega.
A new Vega vis type directly in VisBuilder: This is implementable, but the benefit of integrating the entire Vega vis into VisBuilder is not yet clear.

Current Workflow

VisLib in VisBuilder Workflow

Vega Vis Workflow

Spec Parsing: The Vega spec JSON file is parsed and validated. This ensures the spec adheres to the Vega or Vega-Lite schema.
Data Retrieval: If the spec includes OpenSearch queries (usually in the data.url section), these queries are executed against the OpenSearch cluster. The results (raw response) are fetched and prepared for use in the visualization.
Context Integration: OpenSearch Dashboards-specific context (like index pattern, time range filters, dashboard filters) is applied to the spec. Special placeholders like %timefield%, %context%, etc., are replaced with actual values.
Data Transformation: Any data transformations specified in the Vega spec are applied to the retrieved data. This might include operations like filtering, aggregating, or calculating new fields.
Vega Runtime Compilation: The parsed spec is compiled into a runtime representation that Vega can execute. This compilation process resolves data sources, scales, and other components defined in the spec.
Rendering: The Vega runtime executes the compiled spec. This generates the actual Canvas elements.
Integration with OpenSearch Dashboards: The rendered visualization is integrated into the OpenSearch Dashboards interface. This includes handling interactions like zooming, panning, and tooltips.
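
To make the Context Integration step more concrete, here is a minimal sketch of how %context% and %timefield% could be resolved into a query body. It is a simplified stand-in, not the parser OpenSearch Dashboards ships: the function and interface names, the index name 'my-index', and the assumption that dashboard filters already arrive as query DSL clauses are all hypothetical.

```typescript
// Illustrative stand-in for the context step; not the actual OpenSearch Dashboards parser.
interface DashboardContextSketch {
  timeRange: { from: string; to: string };
  filters: object[]; // dashboard filters, assumed to be query DSL clauses already
}

interface VegaUrlSketch {
  index: string;
  '%context%'?: boolean;
  '%timefield%'?: string;
  body: Record<string, unknown>;
}

function resolveContextSketch(
  url: VegaUrlSketch,
  ctx: DashboardContextSketch
): { index: string; body: Record<string, unknown> } {
  const body: Record<string, unknown> = { ...url.body };
  if (url['%context%']) {
    const must: object[] = [...ctx.filters];
    const timeField = url['%timefield%'];
    if (timeField) {
      // Restrict results to the dashboard time range on the named field
      must.push({ range: { [timeField]: { gte: ctx.timeRange.from, lte: ctx.timeRange.to } } });
    }
    body['query'] = { bool: { must } };
  }
  return { index: url.index, body };
}

// 'my-index' is a hypothetical index name used only for this example
const resolvedRequestSketch = resolveContextSketch(
  { index: 'my-index', '%context%': true, '%timefield%': 'timestamp', body: { size: 0 } },
  { timeRange: { from: 'now-15d', to: 'now' }, filters: [] }
);
```

The placeholders are consumed during resolution, so the request that reaches the cluster contains only ordinary query DSL.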

Proposed Design

Key Deliveries for 2.16

Note: This is not a complete version. It is for demo purposes only.

https://github.com/opensearch-project/OpenSearch-Dashboards/assets/79961084/c93519b8-4eb7-437b-b19a-c6f710faeffd

1. Vega Integration in VisBuilder

Extend the existing visualization slice to include Vega-specific state and actions.
Implement a set of reusable utility functions that generate Vega specifications:
- buildDataUrl: Constructs the data URL for OpenSearch queries
- parseAggStructure: Parses the aggregation structure for easier transformation
- generateTransform: Creates Vega transform steps based on the aggregation structure
- buildEncoding: Generates encoding specifications for visual properties
- buildVegaSpec: Assembles the complete Vega specification
Support complex bucket aggregations through dynamic transformation of OpenSearch aggregations to Vega-compatible format.
Implement actions to update Vega state based on user interactions (e.g., setVegaTooltip, setVegaAggs, setVegaTransforms, setVegaEncoding).
Modify the toExpression method to use Vega rendering when enabled.
Ensure seamless integration with existing OpenSearch Dashboards components and workflows.

2. Advanced setting that lets users create visualizations with Vega in VisBuilder

This includes modifications in VisBuilder so that each chart type can use either the visualization expression or the Vega expression. The main purpose is to avoid breaking the existing user experience. New controls will only be added to the Vega vis.

3. Easy migration from VisLib visualizations created by VisBuilder to Vega vis. Allow embedding both kinds of visualization in a Dashboard.

Allow saving either a VisLib vis or a Vega vis: the only difference in the URL is the useVegaRendering value in the style state, which decides whether the visualization expression or the Vega expression is used. When useVegaRendering is true, VisBuilder renders with Vega and shows the toggle turned on.

/vis-builder/edit/471fa110-2ba8-11ef-b457-4707dd1c36d9#?
_q=(filters:!(),query:(language:kuery,query:''))&
_a=(metadata:(editor:(errors:(),state:loading)),
style:(addLegend:!t,addTooltip:!t,legendPosition:right,type:area,useVegaRendering:!f), // different part
ui:(),visualization:(activeVisualization:(aggConfigParams:!(),name:area),
indexPattern:ff959d40-b880-11e8-a6d9-e546fe2bba5f,searchField:''))&
_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15d,to:now))

The same applies when embedding in a Dashboard: a visualization saved with useVegaRendering set to true is embedded as a Vega vis.

4. More controls for the line chart (optional). Use the line chart as an example and integrate all the controls from the line visualization into the VisBuilder Vega line chart. Optional: add 1-2 new controls.
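
The role of the useVegaRendering flag from item 3 can be sketched as a single branch on the saved style state. The type and function names are hypothetical, and the style values mirror the URL example above; the point is only that the flag is the sole input deciding which expression is built.

```typescript
// Hypothetical shape of the style state seen in the saved URL
interface StyleStateSketch {
  type: string;
  addLegend: boolean;
  addTooltip: boolean;
  legendPosition: string;
  useVegaRendering: boolean;
}

type ExpressionKindSketch = 'vega' | 'visualization';

// The flag alone decides which expression is used for rendering and embedding
function pickExpressionKind(style: StyleStateSketch): ExpressionKindSketch {
  return style.useVegaRendering ? 'vega' : 'visualization';
}

// Values taken from the URL example: useVegaRendering:!f decodes to false
const savedStyleSketch: StyleStateSketch = {
  addLegend: true,
  addTooltip: true,
  legendPosition: 'right',
  type: 'area',
  useVegaRendering: false,
};

const vislibKindSketch = pickExpressionKind(savedStyleSketch);
const vegaKindSketch = pickExpressionKind({ ...savedStyleSketch, useVegaRendering: true });
```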

Implementation Details regarding the VegaSpecBuilder Class

Method 1: Passing the Whole Aggregation (Aggs) as Input

Key Differences from Static Vega Spec Input:

Dynamic State Management: Utilize the visualization slice in Redux to manage Vega-specific state, enabling real-time updates based on user interactions.
Context Integration: Incorporate OpenSearch Dashboards context (e.g., index patterns, time ranges) directly into the Vega specification through the visualization state.
Data Retrieval: Leverage the modified opensearchaggs function to obtain aggregations directly, ensuring consistency with OpenSearch Dashboards' data model.
Data Transformation: Implement custom utility functions to dynamically generate Vega transforms based on the aggregation structure, providing more flexibility than static transforms.
Spec Generation: Dynamically construct the entire Vega specification using utility functions, allowing for on-the-fly adjustments based on user inputs and data changes.
Integration with Existing Tools: Seamlessly switch between traditional VisBuilder rendering and Vega rendering, maintaining compatibility with existing visualizations.

1. Extend the Visualization Slice for context integration

We'll extend the existing visualization slice to include Vega-specific state and actions:

import { createSlice, PayloadAction } from '@reduxjs/toolkit';
import { CreateAggConfigParams } from '../../../../../data/common';
import { VisBuilderServices } from '../../../types';
import { setActiveVisualization } from './shared_actions';

export interface VegaState {
  dataUrl: any;
  transforms: any[];
  encoding: any;
  aggs: any;
  indexPattern: string | null;
  metrics: any[];
  buckets: any[];
  tooltip: any;
  timeField: string;
  split: any[];
  group: any[];
  segment: any[];
  type: string;
  useVegaRendering: boolean;
}

export interface VisualizationState {
  indexPattern?: string;
  searchField: string;
  activeVisualization?: {
    name: string;
    aggConfigParams: CreateAggConfigParams[];
    draftAgg?: CreateAggConfigParams;
  };
  vega: VegaState;
}

// ... (keep existing initial state and preloaded state logic)

export const slice = createSlice({
  name: 'visualization',
  initialState,
  reducers: {
    // ... (keep existing reducers)

    // Add Vega-specific reducers
    setVegaTooltip: (state, action: PayloadAction<any>) => {
       state.vega.tooltip = action.payload;
    },
    setVegaAggs: (state, action: PayloadAction<any>) => {
       state.vega.aggs = action.payload;
    },
    setVegaTransforms: (state, action: PayloadAction<any[]>) => {
       state.vega.transforms = action.payload;
    },
    setVegaEncoding: (state, action: PayloadAction<any>) => {
       state.vega.encoding = action.payload;
    },
    // Add more Vega-specific reducers as needed
  },
  // ... (keep existing extra reducers)
});

// Export actions
export const {
  // ... (keep existing action exports)
  setVegaTooltip,
  setVegaAggs,
  setVegaTransforms,
  setVegaEncoding,
} = slice.actions;
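
The slice above relies on Redux Toolkit, which lets reducers assign to state directly. For readers who want to see the equivalent updates without the library, here is a framework-free sketch. The initial values are assumptions (the proposal does not list defaults for the Vega state), and the names ending in Sketch are hypothetical.

```typescript
// Mirrors VegaState from the proposal, with unknown in place of any
interface VegaStateSketch {
  dataUrl: unknown;
  transforms: unknown[];
  encoding: unknown;
  aggs: unknown;
  indexPattern: string | null;
  metrics: unknown[];
  buckets: unknown[];
  tooltip: unknown;
  timeField: string;
  split: unknown[];
  group: unknown[];
  segment: unknown[];
  type: string;
  useVegaRendering: boolean;
}

// Assumed defaults; Vega rendering stays off unless explicitly enabled
const initialVegaStateSketch: VegaStateSketch = {
  dataUrl: {}, transforms: [], encoding: {}, aggs: {}, indexPattern: null,
  metrics: [], buckets: [], tooltip: {}, timeField: '',
  split: [], group: [], segment: [], type: 'line', useVegaRendering: false,
};

type VegaActionSketch =
  | { type: 'setVegaTooltip'; payload: unknown }
  | { type: 'setVegaAggs'; payload: unknown }
  | { type: 'setVegaTransforms'; payload: unknown[] }
  | { type: 'setVegaEncoding'; payload: unknown };

// Each case returns a new object instead of mutating the previous state
function vegaReducerSketch(state: VegaStateSketch, action: VegaActionSketch): VegaStateSketch {
  switch (action.type) {
    case 'setVegaTooltip':
      return { ...state, tooltip: action.payload };
    case 'setVegaAggs':
      return { ...state, aggs: action.payload };
    case 'setVegaTransforms':
      return { ...state, transforms: action.payload };
    case 'setVegaEncoding':
      return { ...state, encoding: action.payload };
    default:
      return state;
  }
}

const nextVegaStateSketch = vegaReducerSketch(initialVegaStateSketch, {
  type: 'setVegaTransforms',
  payload: [{ flatten: ['geo.dest'] }],
});
```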

2. Data Retrieval with proper aggregations: Utilize opensearchaggs to retrieve aggs directly

Update the opensearchaggs function to return the constructed aggregations:

export const modifiedOpensearchaggs = () => ({
  // ... (keep existing properties)

  async fn(input, args, { inspectorAdapters, abortSignal }) {
    // ... (keep existing logic)

    // Return the constructed aggs using toDsl
    const constructedAggs = aggs.toDsl(args.metricsAtAllLevels);

    return constructedAggs;
  }
});

3. Data Transformation

Data transformation in Vega is done through transform steps. Their role is similar to tabifyAggResponse, which aims to flatten nested structures for visualization. The main difference in approach is that tabifyAggResponse creates a complete tabular representation of the data, while the Vega transform provides a series of steps that transform the data on the fly during rendering. This can make the Vega approach more memory-efficient and potentially faster for large datasets, since the flattened table is not materialized ahead of time. Further comparison:

Output format: tabifyAggResponse produces a tabular format with rows and columns, while the Vega transform creates a series of steps to transform the data within Vega.
Naming conventions: tabifyAggResponse uses numeric IDs (e.g., 2-1) for column names, while the Vega transform uses more descriptive names based on the aggregation structure.
Handling of buckets: tabifyAggResponse creates separate rows for each bucket combination, while the Vega transform uses flatten operations to handle nested buckets.

Here we will add two utility functions:

parseAggStructure: Recursively parses the aggregation structure to create a simplified representation.
generateTransform: Generates the Vega transform steps based on the parsed aggregation structure.

function parseAggStructure(aggs: any, path: string[] = []): any {
  const result: any = {};

  for (const [key, value] of Object.entries(aggs)) {
    if (key === 'buckets' && Array.isArray(value)) {
      result.buckets = value[0]; // Take the first bucket as a sample
    } else if (typeof value === 'object' && value !== null) {
      result[key] = parseAggStructure(value, [...path, key]);
    } else {
      result[key] = value;
    }
  }

  return result;
}

function generateTransform(aggStructure: any, aggs: any): any[] {
  const transform: any[] = [];
  const flattenStack: string[] = [];

  function getFieldName(aggId: string): string {
    return aggs[aggId]?.terms?.field || aggs[aggId]?.date_histogram?.field || aggId;
  }

  function traverse(obj: any, path: string[] = [], depth: number = 0) {
    for (const [key, value] of Object.entries(obj)) {
      if (key === 'buckets') {
        const parentKey = path[path.length - 1];
        const fieldName = getFieldName(parentKey);

        if (parentKey !== '2') { // Skip the top-level bucket agg
          transform.push({ calculate: `datum['${parentKey}']['buckets']`, as: fieldName });
          transform.push({ flatten: [fieldName] });
          flattenStack.push(fieldName);

          // Add key calculation
          transform.push({ calculate: `datum['${fieldName}'].key`, as: fieldName });
        } else {
          // For bucket agg, use 'key' directly
          transform.push({ calculate: "datum['key']", as: fieldName });
        }

        traverse(value, [...path, key], depth + 1);

        if (parentKey !== '2') {
          flattenStack.pop();
        }
      } else if (typeof value === 'object' && value !== null) {
        traverse(value, [...path, key], depth + 1);
      } else if (key === 'value' && depth === Object.keys(aggStructure).length - 1) {
        // Only add metric calculation for the deepest level
        const metricKey = path[path.length - 2];
        const fieldName = aggs[metricKey]?.avg?.field || `${metricKey}_value`;
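        // Reference the flattened parent through datum so the result is a valid Vega expression
        const parentField = flattenStack[flattenStack.length - 1];
        const parent = parentField ? `datum['${parentField}']` : 'datum';
        transform.push({ calculate: `${parent}['${metricKey}']['value']`, as: `avg_${fieldName}` });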
      }
    }
  }

  traverse(aggStructure);
  return transform;
}

Use these functions in the Vega utility functions in the next sub-section:

const buildTransforms = (aggs: any) => {
  const aggStructure = parseAggStructure(aggs);
  // generateTransform also needs the original aggs to resolve field names
  return generateTransform(aggStructure, aggs);
};

Example Result: Given the following aggregation:

"aggs": {
  "2": {
    "date_histogram": {
      "field": "timestamp",
      "fixed_interval": "12h",
      "time_zone": "America/Los_Angeles",
      "min_doc_count": 1
    },
    "aggs": {
      "3": {
        "terms": {
          "field": "geo.dest",
          "order": { "_count": "desc" },
          "size": 5
        },
        "aggs": {
          "1": {
            "avg": { "field": "bytes" }
          }
        }
      }
    }
  }
}

The datum structure would be:

{
  "3": {
    "buckets": [
      {
        "1": { "value": 5069.333333333333 },
        "key": "CN",
        "doc_count": 3
      },
      // ... other buckets
    ]
  },
  "key_as_string": "2024-06-30T12:00:00.000-07:00",
  "key": 1719774000000,
  "doc_count": 23
}

The generated transform would be:

[
  {
    "calculate": "datum['key']",
    "as": "timestamp"
  },
  {
    "calculate": "datum['3']['buckets']",
    "as": "geo.dest"
  },
  {
    "flatten": ["geo.dest"]
  },
  {
    "calculate": "datum['geo.dest'].key",
    "as": "geo.dest"
  },
  {
    "calculate": "datum['geo.dest']['1']['value']",
    "as": "avg_bytes"
  }
]
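
To check what these steps produce, the calculate and flatten semantics can be emulated in plain TypeScript. This is a sketch, not Vega itself: calcStep and flattenStep are hypothetical helpers, expressions are written as functions rather than parsed strings, and the second bucket ('IN') is made up for illustration. Because transform steps run in order, the sketch reads the metric before the key step overwrites geo.dest with a string; once that overwrite has happened, the bucket's metric is no longer reachable through that field.

```typescript
type VegaRowSketch = Record<string, any>;

// calculate: add (or overwrite) one field on every row
function calcStep(
  rows: VegaRowSketch[],
  as: string,
  fn: (datum: VegaRowSketch) => unknown
): VegaRowSketch[] {
  return rows.map((datum) => ({ ...datum, [as]: fn(datum) }));
}

// flatten: emit one row per element of an array-valued field.
// Rows whose field is not an array are dropped in this sketch.
function flattenStep(rows: VegaRowSketch[], field: string): VegaRowSketch[] {
  const out: VegaRowSketch[] = [];
  for (const datum of rows) {
    const values: unknown[] = Array.isArray(datum[field]) ? datum[field] : [];
    for (const value of values) {
      out.push({ ...datum, [field]: value });
    }
  }
  return out;
}

const sampleDatumSketch: VegaRowSketch = {
  '3': {
    buckets: [
      { '1': { value: 5069.333333333333 }, key: 'CN', doc_count: 3 },
      { '1': { value: 4200 }, key: 'IN', doc_count: 2 }, // hypothetical second bucket
    ],
  },
  key_as_string: '2024-06-30T12:00:00.000-07:00',
  key: 1719774000000,
  doc_count: 23,
};

let flatRowsSketch = calcStep([sampleDatumSketch], 'timestamp', (d) => d['key']);
flatRowsSketch = calcStep(flatRowsSketch, 'geo.dest', (d) => d['3']['buckets']);
flatRowsSketch = flattenStep(flatRowsSketch, 'geo.dest');
// Metric first, while 'geo.dest' still holds the bucket object
flatRowsSketch = calcStep(flatRowsSketch, 'avg_bytes', (d) => d['geo.dest']['1']['value']);
flatRowsSketch = calcStep(flatRowsSketch, 'geo.dest', (d) => d['geo.dest'].key);
```

Each resulting row carries timestamp, geo.dest, and avg_bytes as flat fields, which is the shape the encoding step expects.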

4. Create Vega Utility Functions

Create utility functions in a separate file:

// vegaUtils.ts

export const buildDataUrl = (indexPattern: string, timeField: string, aggs: any) => {
  return {
    context: true,
    timefield: timeField,
    index: indexPattern,
    body: {
      aggs: aggs,
      size: 0,
    },
  };
};

export const buildTransforms = (metrics: any[], buckets: any[]) => {
  // Implementation of buildTransforms logic
};

export const buildEncoding = (metrics: any[], buckets: any[], fieldsMap: any) => {
  // Implementation of buildEncoding logic
};

export const buildVegaSpec = (state: VisualizationState) => {
  const { vega } = state;
  const dataUrl = buildDataUrl(vega.specBuilder.indexPattern!, vega.specBuilder.timeField, vega.specBuilder.aggs);
  const transforms = buildTransforms(vega.specBuilder.metrics, vega.specBuilder.buckets);
  const encoding = buildEncoding(vega.specBuilder.metrics, vega.specBuilder.buckets, vega.specBuilder.fieldsMap);

  return {
    $schema: "https://vega.github.io/schema/vega-lite/v5.json",
    data: { url: dataUrl },
    transform: transforms,
    mark: { type: vega.specBuilder.type, point: true },
    encoding: encoding,
  };
};
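
Since buildEncoding is left as a stub above, here is one possible shape for it, under stated assumptions: segment maps to x, the metric maps to y, group maps to color, and split maps to the facet channel. The input interface and helper names are hypothetical. One detail worth noting is that Vega-Lite treats a dot in a field name as nested access, so flat names such as geo.dest need their dots escaped when used in an encoding.

```typescript
// Hypothetical, simplified input: one field per role
interface EncodingInputSketch {
  metricField: string;
  segmentField: string;
  segmentIsDate: boolean;
  groupField?: string;
  splitField?: string;
}

// Escape dots and brackets so Vega-Lite reads the name as a flat field
function escapeFieldName(fieldName: string): string {
  return fieldName.replace(/[.\[\]]/g, '\\$&');
}

function buildEncodingSketch(input: EncodingInputSketch): Record<string, any> {
  const encoding: Record<string, any> = {
    x: {
      field: escapeFieldName(input.segmentField),
      type: input.segmentIsDate ? 'temporal' : 'nominal',
    },
    y: { field: escapeFieldName(input.metricField), type: 'quantitative' },
  };
  if (input.groupField) {
    // One series (color) per group value within a single chart
    encoding['color'] = { field: escapeFieldName(input.groupField), type: 'nominal' };
  }
  if (input.splitField) {
    // One small chart per split value
    encoding['facet'] = { field: escapeFieldName(input.splitField), type: 'nominal' };
  }
  return encoding;
}

const lineEncodingSketch = buildEncodingSketch({
  metricField: 'avg_bytes',
  segmentField: 'timestamp',
  segmentIsDate: true,
  groupField: 'geo.dest',
});
```

Other chart types (pie, for example) would need a different channel mapping, so a real implementation would likely branch on the chart type.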

5. Update toExpression Method

Modify the toExpression method to use the new utility functions:

const toExpression = async (params) => {
  const state = store.getState().visualization;
  if (state.vega.useVegaRendering) {
    const vegaSpec = buildVegaSpec(state);
    let vis = await createVis('vega', state.activeVisualization!.aggConfigParams, state.indexPattern!, params.searchContext);
    vis.params = {
      spec: JSON.stringify(vegaSpec),
    };

    const vega_expression = await buildPipeline(vis, {
      timefilter: params.timefilter,
      timeRange: params.timeRange,
      abortSignal: undefined,
      visLayers: undefined,
      visAugmenterConfig: undefined,
    });
    return vega_expression;
  }
  // ... (existing non-Vega rendering logic)
};

Method 2: Construct Aggs

Method 2 follows a similar structure to Method 1, but instead of passing the whole aggregation, it constructs the aggregation from individual components (metrics, segment, group, split). The main difference lies in the setVegaAggs reducer and the buildVegaSpec utility function:

// In the visualization slice
setVegaAggs: (state, action: PayloadAction<{metrics: any[], segment: any[], group: any[], split: any[]}>) => {
  const { metrics, segment, group, split } = action.payload;
  state.vega.specBuilder.metrics = metrics;
  state.vega.specBuilder.segment = segment;
  state.vega.specBuilder.group = group;
  state.vega.specBuilder.split = split;
  // Construct aggs from these components
  state.vega.specBuilder.aggs = constructAggs(metrics, segment, group, split);
},

// In vegaUtils.ts
export const constructAggs = (metrics: any[], segment: any[], group: any[], split: any[]) => {
  // Logic to construct aggs from individual components
};
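
As a rough idea of what constructAggs could look like, the sketch below nests bucket aggregations around the metrics. The nesting order (split outermost, then segment, then group, with metrics innermost) and the component shape are assumptions, not something the proposal specifies.

```typescript
// Hypothetical component shape: an agg ID, an agg type, and its parameters
interface AggComponentSketch {
  id: string;
  type: string;
  params: Record<string, unknown>;
}

function constructAggsSketch(
  metrics: AggComponentSketch[],
  segment: AggComponentSketch[],
  group: AggComponentSketch[],
  split: AggComponentSketch[]
): Record<string, any> {
  // Metrics form the innermost level
  let nested: Record<string, any> = {};
  for (const metric of metrics) {
    nested[metric.id] = { [metric.type]: metric.params };
  }
  // Assumed order, outermost to innermost: split, segment, group
  const bucketsOuterToInner = [...split, ...segment, ...group];
  // Wrap from the innermost bucket outward
  for (const bucket of bucketsOuterToInner.slice().reverse()) {
    nested = { [bucket.id]: { [bucket.type]: bucket.params, aggs: nested } };
  }
  return nested;
}

// Components matching the example aggregation from Method 1
const builtAggsSketch = constructAggsSketch(
  [{ id: '1', type: 'avg', params: { field: 'bytes' } }],
  [{ id: '2', type: 'date_histogram', params: { field: 'timestamp', fixed_interval: '12h' } }],
  [{ id: '3', type: 'terms', params: { field: 'geo.dest', size: 5 } }],
  []
);
```

With these inputs the result has the same nesting as the example aggregation in Method 1: date_histogram 2 contains terms 3, which contains avg 1.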

Method 3: Passing Formatted Data to Vega Spec

This method involves passing pre-formatted data directly to the Vega spec. This method requires modifications to the buildVegaSpec function:

// In vegaUtils.ts
export const buildVegaSpec = (state: VisualizationState, formattedData: any[]) => {
  return {
    $schema: "https://vega.github.io/schema/vega-lite/v5.json",
    data: { values: formattedData },
    // ... other spec properties
  };
};

// In the component where the Vega spec is created
const formattedData = await getFormattedDataFromOpensearchaggs(/* params */);
const vegaSpec = buildVegaSpec(state, formattedData);
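
For comparison with Method 1, the pre-formatting step that Method 3 depends on might look like the following sketch, which walks a nested aggregation response and emits one flat row per innermost bucket. The function name, the column naming by agg ID, and the sample response are assumptions; getFormattedDataFromOpensearchaggs itself is not defined in this proposal.

```typescript
// Walk nested buckets in the given order and emit one row per innermost bucket
function flattenAggResponseSketch(
  node: Record<string, any>,
  bucketAggIds: string[],
  metricAggIds: string[],
  parentRow: Record<string, unknown> = {}
): Record<string, unknown>[] {
  const currentId = bucketAggIds[0];
  if (currentId === undefined) {
    // No bucket levels left: read metric values at this level
    const row: Record<string, unknown> = { ...parentRow };
    for (const metricId of metricAggIds) {
      row[metricId] = node[metricId] ? node[metricId].value : null;
    }
    return [row];
  }
  const rows: Record<string, unknown>[] = [];
  const bucketList: Record<string, any>[] =
    node[currentId] && Array.isArray(node[currentId].buckets) ? node[currentId].buckets : [];
  for (const bucket of bucketList) {
    const childRows = flattenAggResponseSketch(bucket, bucketAggIds.slice(1), metricAggIds, {
      ...parentRow,
      [currentId]: bucket['key'],
    });
    for (const child of childRows) {
      rows.push(child);
    }
  }
  return rows;
}

// Sample response shaped like the datum in Method 1 (a single bucket per level)
const sampleAggregationsSketch = {
  '2': {
    buckets: [
      {
        key: 1719774000000,
        doc_count: 23,
        '3': { buckets: [{ key: 'CN', doc_count: 3, '1': { value: 5069.333333333333 } }] },
      },
    ],
  },
};

const inlineRowsSketch = flattenAggResponseSketch(sampleAggregationsSketch, ['2', '3'], ['1']);
const inlineSpecSketch = {
  $schema: 'https://vega.github.io/schema/vega-lite/v5.json',
  data: { values: inlineRowsSketch },
};
```

The full row set is held in memory and serialized into the spec, which is the scaling concern raised for this method.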

3. Pros and Cons

Method 1: Passing Whole Aggregation

Pros:
- Maintains consistency with existing aggregation structures
- Efficient for complex aggregations
Cons:
- Less flexibility for custom aggregations

Method 2: Construct Aggs

Pros:
- Offers more flexibility in aggregation construction
- Allows for fine-grained control over each aggregation component
Cons:
- More complex to implement and maintain
- Potential for inconsistencies if not carefully managed

Method 3: Passing Formatted Data

Pros:
- Simplifies the Vega spec
- Allows for pre-processing and custom data formatting
Cons:
- Potential memory issues with large datasets
- Less efficient for frequently updating data
- May not scale well for very large datasets

Conclusion

After considering all three methods, we have decided to proceed with Method 1: Passing Whole Aggregation. This approach offers the best balance between maintaining consistency with existing OpenSearch Dashboards structures and providing efficient handling of complex aggregations. It avoids the potential scalability and performance issues of Method 3 while being less complex to implement and maintain than Method 2. Method 1 aligns well with the current OpenSearch Dashboards architecture and will likely provide the smoothest integration path for Vega visualizations within the existing framework. It also leaves room for future optimizations and extensions if needed.

How to Test / How to Make the Transfer Robust

To ensure the robustness and accuracy of the VegaSpecBuilder implementation, we should create a series of test cases that cover various combinations of metrics and buckets. These test cases will help verify that the VegaSpecBuilder can correctly handle different visualization configurations.

Test Cases

1 Metric 1 Bucket:
- 1 Metric + 1 Segment: Verify that the VegaSpecBuilder correctly sets the x-axis to the segment field and the y-axis to the metric field.
- 1 Metric + 1 Split: Ensure that the VegaSpecBuilder creates separate charts for each split value.
- 1 Metric + 1 Group: Check that the VegaSpecBuilder generates separate lines (or other marks) for each group value within a single chart.
2 Metrics 1 Bucket:
- 2 Metrics + 1 Segment: Confirm that the VegaSpecBuilder supports multiple y-axes for the different metrics while using the segment field for the x-axis.
- 2 Metrics + 1 Split: Ensure that separate charts are created for each split value, with each chart containing the two metrics.
- 2 Metrics + 1 Group: Verify that the VegaSpecBuilder generates separate lines (or other marks) for each group value within a single chart, displaying both metrics.
1 Metric 3 Buckets:
- 1 Metric + 1 Split + 1 Group + 1 Segment: Test that the VegaSpecBuilder correctly uses the segment field for the x-axis, creates separate lines (or other marks) for each group value, and generates separate charts for each split value.
1 Metric 4 Buckets:
- 1 Metric + 1 Split + 2 Group + 1 Segment
2 Metrics 4 Buckets:
- 2 Metrics + 1 Split + 2 Group + 1 Segment
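
One way to keep these cases manageable is a table-driven layout, where each case is a row and the expected chart shape is derived from the counts. The sketch below only encodes the expectations stated in the list above; the interface and function names are hypothetical, and a real test would compare these expectations against the generated spec.

```typescript
interface VisCaseSketch {
  label: string;
  metrics: number;
  segment: number;
  group: number;
  split: number;
}

interface ExpectedShapeSketch {
  yFieldCount: number;     // one y field per metric
  xFromSegment: boolean;   // segment drives the x-axis
  seriesPerGroup: boolean; // group yields separate lines or marks
  chartPerSplit: boolean;  // split yields separate charts
}

function expectedShapeFor(visCase: VisCaseSketch): ExpectedShapeSketch {
  return {
    yFieldCount: visCase.metrics,
    xFromSegment: visCase.segment > 0,
    seriesPerGroup: visCase.group > 0,
    chartPerSplit: visCase.split > 0,
  };
}

// The nine combinations listed in the test cases
const visCasesSketch: VisCaseSketch[] = [
  { label: '1 Metric + 1 Segment', metrics: 1, segment: 1, group: 0, split: 0 },
  { label: '1 Metric + 1 Split', metrics: 1, segment: 0, group: 0, split: 1 },
  { label: '1 Metric + 1 Group', metrics: 1, segment: 0, group: 1, split: 0 },
  { label: '2 Metrics + 1 Segment', metrics: 2, segment: 1, group: 0, split: 0 },
  { label: '2 Metrics + 1 Split', metrics: 2, segment: 0, group: 0, split: 1 },
  { label: '2 Metrics + 1 Group', metrics: 2, segment: 0, group: 1, split: 0 },
  { label: '1 Metric + 1 Split + 1 Group + 1 Segment', metrics: 1, segment: 1, group: 1, split: 1 },
  { label: '1 Metric + 1 Split + 2 Group + 1 Segment', metrics: 1, segment: 1, group: 2, split: 1 },
  { label: '2 Metrics + 1 Split + 2 Group + 1 Segment', metrics: 2, segment: 1, group: 2, split: 1 },
];

const expectedShapesSketch = visCasesSketch.map(expectedShapeFor);
```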

Future Extension Discussion

Supporting Multiple Query Languages (DQL, PPL, SQL)

Extend the VegaSpecBuilder to handle different query languages:

buildPPLQuery() {
   this.pplQuery = ...
}

buildPPLQuerySpec(pplQuery) {
  return {
    data: {
      url: {
        index: this.indexPattern.title,
        body: {
          query: {
            source: {
              query: this.pplQuery,
            },
          },
          size: 0,
        },
      },
      format: this.format
    },
  };
}

buildSQLQuerySpec(sqlQuery) {
  ...
}

buildWithQuerySpec(queryType = 'dql', query = '') {
  let dataUrl;
  if (queryType === 'dql') {
    dataUrl = this.buildDataUrl();
  } else if (queryType === 'ppl') {
    dataUrl = this.buildPPLQuerySpec(query);
  } else if (queryType === 'sql') {
    dataUrl = this.buildSQLQuerySpec(query);
  }
  return build(this.data)
}

Handling Multiple Queries and Data Sources

Handle multiple queries and data sources by extending the buildVegaSpec method:

buildMultiQuerySpec(queries) {
  // Build one named data source per query passed in
  this.dataWithMultipleQuery = queries.map((query, index) => ({
    name: `data${index + 1}`,
    url: {
      index: this.indexPattern.title,
      body: {
        query: query.format === 'ppl' ? {
          source: {
            query: this.buildPPLQuery(),
          },
        } : {
          sql: {
            query: this.buildSQLQuery(),
          },
        },
        size: 0,
      },
    },
    format: this.format
  }));

  return build(this.dataWithMultipleQuery)
}

2.16 Timeline and Task Breakdown

[ ] Integrate vega vis in VisBuilder
[ ] Convert existing vis charts to vega in VisBuilder
[ ] Add more controls for line chart in VisBuilder

FAQ

YANG-DB commented 4 months ago

@ananzh very nice! I would add another important capability: allowing the community to contribute generic vis tools as part of the out-of-the-box vis tools catalog

YANG-DB commented 4 months ago

I strongly recommend reviewing the vega-altair engine used to do this same transformation from a high level language (python) into the vega spec (json)

YANG-DB commented 4 months ago

Another suggestion is to integrate the existing open source vega-editor to replace our existing Vega JSON editor, to simplify the actual Vega editing for advanced vis builders

ashwin-pc commented 4 months ago

zooming in and out

This exists in the tool today.

Toggle in VisBuilder to allow user to display vislib vis or vega vis in VisBuilder, to save as vislib vis or vega vis and to embed either vislib vis or vega vis in Dashboard .

We should not have a toggle in the UI since for most users Vega is an implementation detail. Only advanced users would care about it. If we want to maintain the experience for users, we should either try to match the experience or keep an advanced settings toggle to allow the user to go back to the older experience.

A new vega type vis directly in VisBuilder

Why do we need this as opposed to just redirecting the user to the Vega editor? If we do it this way, we should allow the user to switch back and carry context from Vega back to the other chart types. Right now if I switch between line and bar and go back to line, the line chart carries over the changes that it can from the bar chart. With this Vega type, can we do that?

VegaSpecBuilder Class

In this class you are also constructing the query, but it's very specific to DSL. How would this work with PPL and SQL? They each support a limited subset of aggregations and do not support all the agg types.

Supporting Multiple Query Languages (DQL, PPL, SQL)

If we aren't integrating VisBuilder into Discover, we might not need this. Would like to hear from the others about this, but my reasoning is that the user never has to enter the query that is used to fetch the data from the backend. If that's the case, the language we use under the hood does not matter. The only exception to this is data sources that don't support visualizations in other languages. In that scenario I'd like this to be a little more modular so that when other languages are added, it's not on the VisType to manually update itself to support all the new languages.

One approach here could be to allow the VisType to specify which languages it supports, so that they all have to support DQL by default but can optionally specify which other languages they support. But what would be even nicer is if the VisType did not have to know anything about the language used under the hood and only worried about the dataframe that came back and mapped it to the Vis, leaving the query language part to the framework. But this might be trickier.

virajsanghvi commented 4 months ago

Is the problem to solve reflected in the requirements? If so, why is this the best way to solve this?
What are the expectations for migrated visuals? Should they look exactly like they did pre-Vega?
VegaSpecBuilder - a little unclear on how this fits in at a high level
Do we want a toggle for Vega-Lite vs just have everything render that way?
How do you get to Vega Vis type from other visualizations?
What features do we want to add to line chart? Can we be specific? - is this specific to vega integration?
Are there more charts than just Pie? - Should visual types be part of this if its specific to vega integration?
"1.Minimum Changed Customer Experience" - what is changing?
Is VegaSpecBuilder the state of the configuration? Or does it just operate one way (config -> vega spec)?
Should the builder be in state or the vega spec? The builder pattern seems to mutate state vs rely on building a new spec.
Why be able to set the state explicitly and set particular options? Why not take one approach or the other?

virajsanghvi commented 4 months ago

VegaSpecBuilder - for building queries - should different languages contribute definition on how to build the query? Do we want these centrally located.
Can you summarize what the alternative proposal was in comparison to the specbuilder?

ananzh commented 4 months ago

A hard code mapping for demo purpose

export const createVegaSpec = (styleState, dimensions, valueAxes, aggConfigs, indexPattern, searchContext) => {
  const { addLegend, addTooltip, type } = styleState;
  const { x, y } = dimensions;
  const index = indexPattern.title;
  const timeField = searchContext.timeRange ? searchContext.timeRange.field : "@timestamp"; // Use the time range field or default to "@timestamp"

  const dateHistogram = aggConfigs.aggs.find(agg => agg.schema === 'segment');
  const metric = aggConfigs.aggs.find(agg => agg.schema === 'metric');
  const metricType = metric.type.name;

  const dataUrl = {
    context: true,
    timefield: timeField,
    index: index,
    body: {
      aggs: {
        1: {
          date_histogram: {
            field: dateHistogram.params.field.displayName,
            fixed_interval: "3h", // hard coded for now
            time_zone: "America/Los_Angeles", // can be dynamic if required
            min_doc_count: dateHistogram.params.min_doc_count,
            extended_bounds: dateHistogram.params.extended_bounds,
          },
          aggs: {
            2: {
              [metricType]: {
                field: metric.params.field.displayName
              }
            }
          }
        }
      },
      size: 0
    }
  };

  const vegaSpec = {
    $schema: "https://vega.github.io/schema/vega-lite/v5.json",
    data: {
      url: dataUrl,
      format: {
        property: "aggregations.1.buckets"
      }
    },
    transform: [
      {
        calculate: "datum.key",
        as: "timestamp"
      },
      {
        calculate: `datum[2].value`,
        as: metric.params.field.displayName
      }
    ],
    layer: [
      {
        mark: {
          type: "line" // or dynamic type if needed
        }
      },
      {
        mark: {
          type: "circle",
          tooltip: addTooltip
        }
      }
    ],
    encoding: {
      x: {
        field: "timestamp",
        type: "temporal",
        axis: {
          title: timeField
        }
      },
      y: {
        field: metric.params.field.displayName,
        type: "quantitative",
        axis: {
          title: metric.params.field.displayName
        }
      },
      color: {
        datum: metric.params.field.displayName,
        type: "nominal"
      }
    }
  };

  if (addLegend) {
    vegaSpec.encoding.color.legend = {
      title: metric.params.field.displayName
    };
  }

  return vegaSpec;
};

virajsanghvi commented 4 months ago

Can you speak to the difference of the options? I'm not really sure from reading

From method 1: cons

which might not be flexible for dynamic changes.

Are there specific cases you're worried about?

we should create a series of test cases that cover various combinations of metrics and buckets

Just to be clear, we should have test cases for all known combinations, right? And can we prevent unknown combos from being used in the product in some way?

Also, do we clearly understand the expected input/output of these cases?

VegaSpecBuilder

Should we be storing unserializable state in redux?

Also, building the spec is calculated state, is this the right thing to store?

ashwin-pc commented 4 months ago

Create a vega slice

Why do we need a slice? Slices are for state that needs to be stored globally and accessed across the app. The Vega spec is only needed by the Visualization, right? Can't we just create the spec there?

Send modular API to update VegaBuilder Class

Do we need to update both the slice and the aggconfig? or can we update just the aggconfig? My assumption was that the spec could be constructed whenever we want using the style state and the agg config.

Separate buckets Both methods need to separate bucket aggregations into distinct categories: group, split, and segment. This separation is necessary because each type of aggregation serves a different purpose in the visualization:

Can you give a little more detail about this? Not sure I fully understood why we need this.

VegaSpecBuilder

How does this work for different VisTypes? Don't the encodings and specs change between VisTypes? E.g. pie and bar chart will encode the chart differently, right?

const vegaSpecBuilder = useTypedSelector(state => state.vega.specBuilder);

State should not be used to retrieve a function. Why can't vegaSpecBuilder be a simple function?

The Difference

In this section I didn't understand the difference between the two methods. What is method 2? I didn't understand the pros and cons of each approach well enough to know which one is better. An example might help.

Overall, the approach here could benefit from a block diagram explaining how the flow works as the information is passed across the various components

anirudha commented 4 months ago

if we aren't integrating VisBuilder into Discover

How will sql/ ppl users build visualization?

How will discover IA for visualizations be handled with multiple languages support ?

How will we achieve the cohesion tenet without sql / ppl support for visualizations

opensearch-project / OpenSearch-Dashboards

Integrate Vega Vis into VisBuilder Proposal #7067
