opensearch-project / opensearch-catalog

The OpenSearch Catalog is designed to make it easier for developers and the community to contribute, search for, and install artifacts such as plugins, visualization dashboards, and ingestion-to-visualization content packs (data pipeline configurations, normalization, ingestion, dashboards).
Apache License 2.0

[FEATURE] Support Vega Overlay aggregations #138

Open YANG-DB opened 6 months ago

YANG-DB commented 6 months ago

Is your feature request related to a problem?

For many Vega components that show data-point graphs or time-series aggregations, it's interesting to compare the selected time range with the previous time bucket.

[Screenshot 2024-03-08 at 3 46 26 PM]

Do you have any additional context?

The following Vega data attribute demonstrates how this can be achieved.

Each data query is decorated with a formula transform that tags its rows with their time relevancy (Old or Current) in a source field, so the origin of each row is still known after the datasets are merged:

  "data": [
    {
      "name": "rawdata-old",
      "url": {
        "index": "otel-v1-apm-span-*",
        "body": {
          "query": {
            "bool": {
              "must": [
                "%dashboard_context-must_clause%",
                {
                  "range": {
                    "startTime": {
                      "%timefilter%": true,
                      "shift": 1,
                      "unit": "hour"
                    }
                  }
                }
              ],
              "must_not": [
                "%dashboard_context-must_not_clause%"
              ],
              "filter": [
                "%dashboard_context-filter_clause%"
              ]
            }
          },
          "aggs": {
            "services": {
              "terms": {
                "field": "serviceName",
                "size": 15
              },
              "aggs": {
                "time_buckets": {
                  "date_histogram": {
                    "field": "startTime",
                    "interval": {"%autointerval%": true},
                    "extended_bounds": {
                      "min": {"%timefilter%": "min"},
                      "max": {"%timefilter%": "max"}
                    },
                    "min_doc_count":0
                  },
                  "aggs": {
                    "duration": {
                      "avg": {
                        "missing": 0,
                        "script": {
                          "source": "!doc.containsKey('durationInNanos') || doc['durationInNanos'].empty ? 0 : doc['durationInNanos'].value / 1000000.0",
                          "lang": "painless"
                        }
                      }
                    }
                  }
                }
              }
            }
          },
          "size": 0
        }
      },
      "format": {"property": "aggregations.services.buckets"},
      "transform": [
        {"type": "formula", "expr": "'Old'", "as": "source"}
      ]
    },
    {
      "name": "rawdata",
      "url": {
        "index": "otel-v1-apm-span-*",
        "%context%": true,
        "%timefield%": "startTime",
        "body": {
          "aggs": {
            "services": {
              "terms": {
                "field": "serviceName",
                "size": 15
              },
              "aggs": {
                "time_buckets": {
                  "date_histogram": {
                    "field": "startTime",
                    "interval": {"%autointerval%": true},
                    "extended_bounds": {
                      "min": {"%timefilter%": "min"},
                      "max": {"%timefilter%": "max"}
                    },
                    "min_doc_count":0
                  },
                  "aggs": {
                    "duration": {
                      "avg": {
                        "missing": 0,
                        "script": {
                          "source": "!doc.containsKey('durationInNanos') || doc['durationInNanos'].empty ? 0 : doc['durationInNanos'].value / 1000000.0",
                          "lang": "painless"
                        }
                      }
                    }
                  }
                }
              }
            }
          },
          "size": 0
        }
      },
      "format": {"property": "aggregations.services.buckets"},
      "transform": [
        {"type": "formula", "expr": "'Current'", "as": "source"}
      ]
    },
    {
      "name": "flatdata",
      "source": ["rawdata", "rawdata-old"],
      "transform": [
        {
          "type": "flatten",
          "fields": ["time_buckets.buckets"],
          "as": ["val"]
        },
        {
          "type": "formula",
          "as": "timeInMs",
          "expr":"datum.val.key"
        },
        {
          "type": "formula",
          "as": "count",
          "expr":"datum.val.doc_count"
        },
        {
          "type": "formula",
          "as": "duration",
          "expr": "datum.val.duration.value == null ? 0 : datum.val.duration.value"
        },
        {
          "type": "formula",
          "as": "time",
          "expr": "timeFormat(utcParse(datum.val.key_as_string,'%Y-%m-%dT%H:%M:%S.%LZ'), '%B %d, %Y %H:%M')"
        }
      ]
    },
   ....
  ]
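
The Old rows above carry bucket keys from the shifted window, so they may need to be moved forward before they can share an x position with the Current rows. A minimal sketch of one way to do that, assuming the offset equals the one-hour shift used in the rawdata-old query; the names aligneddata and alignedTimeInMs are hypothetical and not part of the original spec. The comments rely on HJSON, which the Dashboards Vega editor accepts:

```hjson
{
  // "aligneddata" and "alignedTimeInMs" are hypothetical names (assumption).
  "name": "aligneddata",
  "source": "flatdata",
  "transform": [
    {
      "type": "formula",
      "as": "alignedTimeInMs",
      // 3600000 ms = 1 hour; assumed to match "shift": 1, "unit": "hour" above.
      "expr": "datum.source === 'Old' ? datum.timeInMs + 3600000 : datum.timeInMs"
    }
  ]
}
```

If the shift or unit in the query changes, the offset in this formula would have to change with it.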

To render both the current and the old data, the marks should take the decorated source field into account:

"marks": [
  {
    "type": "symbol",
    "from": {"data": "flatdata"},
    "encode": {
      "update": {
        "x": {"scale": "xScale", "field": "time"},
        "y": {"scale": "yScale", "field": "count"},
        "fill": [
          {"test": "datum.source === 'Old'", "value": "red"},
          {"value": "blue"}
        ],
        "tooltip": {
          "signal": "{'Date': datum.time, 'Count': datum.count, 'Duration': datum.duration, 'Type': datum.source}"
        }
      }
    }
  }
]
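
As an alternative to testing the source value inside the mark, the colors could come from an ordinal scale, which also makes a legend possible. This is only a sketch: the scale name sourceColor and the legend title are hypothetical, and the scale entry would be appended to whatever scales array the full spec already defines (xScale and yScale are not shown in this issue).

```hjson
{
  "scales": [
    {
      // "sourceColor" is a hypothetical scale name (assumption).
      "name": "sourceColor",
      "type": "ordinal",
      // Domain values match the 'Current' / 'Old' strings set by the formula transforms.
      "domain": ["Current", "Old"],
      "range": ["blue", "red"]
    }
  ],
  "legends": [
    // The legend title is illustrative only.
    {"fill": "sourceColor", "title": "Time range"}
  ]
}
```

With such a scale in place, the mark's fill encoding could be written as {"scale": "sourceColor", "field": "source"}.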