nyurik / kibana-vega-vis

This Kibana plugin lets you build any data visualization from Elasticsearch and other data sources using the Vega grammar. You can even create a visualization on top of an interactive map.
Apache License 2.0

Boxplot color is overwritten #57

Closed emilmirzayev closed 6 years ago

emilmirzayev commented 6 years ago

I intended to make a boxplot based on Elasticsearch data. I added a scale, mapped it to my categorical field, and added a legend based on the colors. Although the legend shows the colors and the mapping correctly, the box plots themselves are not colored; they are drawn in green instead. Below are my Vega code, the Elasticsearch query results, and an image of the boxplot.

1) My Vega code

{
  "$schema": "https://vega.github.io/schema/vega/v3.json",
  "title": "Average delivery delay per meter type",
  "signals": [
    {"name": "plotWidth", "value": 100},
    {"name": "height", "value": 300},
    {
      "name": "tooltip",
      "value": {},
      "on": [
        {"events": "rect:mouseover", "update": "datum"},
        {"events": "rect:mouseout", "update": "{}"}
      ]
    }
  ],
  "data": [
    {
      "name": "results",
      "url": {
        "index": "vee_statistics",
        "%context%": true,
        "%timefield%": "date",
        "body": {
          "size": 0,
          "aggs": {
            "completeness_per_meter_type": {
              "terms": {"field": "meter_type", "size": 10},
              "aggs": {
                "completeness_quantile": {
                  "percentiles": {
                    "field": "avg_delivery_delay",
                    "percents": [25, 50, 75]
                  }
                },
                "min_value": {
                  "min": {"field": "avg_delivery_delay"}
                },
                "max_value": {
                  "max": {"field": "avg_delivery_delay"}
                }
              }
            }
          }
        }
      },
      "format": {
        "property": "aggregations.completeness_per_meter_type.buckets"
      },
      "transform": [
        {
          "type": "formula",
          "expr": "datum.completeness_quantile.values['25.0']",
          "as": "q1"
        },
        {
          "type": "formula",
          "expr": "datum.completeness_quantile.values['50.0']",
          "as": "median"
        },
        {
          "type": "formula",
          "expr": "datum.completeness_quantile.values['75.0']",
          "as": "q3"
        },
        {
          "type": "formula",
          "expr": "datum.min_value.value",
          "as": "min_value"
        },
        {
          "type": "formula",
          "expr": "datum.max_value.value",
          "as": "max_value"
        },
        {
          "type": "fold",
          "fields": [
            "min_value",
            "q1",
            "median",
            "q3",
            "max_value"
          ],
          "as": ["metric", "metricValue"]
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "layout",
      "type": "band",
      "range": "height",
      "domain": {"data": "results", "field": "key"}
    },
    {
      "name": "xscale",
      "type": "linear",
      "range": "width",
      "round": true,
      "domain": {"data": "results", "field": "metricValue"},
      "zero": true,
      "nice": true
    },
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "results", "field": "key"},
      "range": {"scheme": "category20"}
    }
  ],
  "legends":[
    {
      "stroke": "color",
      "title": "Meter type"
    }
  ],
  "axes": [
    {
      "orient": "bottom",
      "scale": "xscale",
      "zindex": 1,
      "tickCount": 5,
      "title": "Delay in minutes"
    },
    {
      "orient": "left",
      "scale": "layout",
      "tickCount": 4,
      "zindex": 1,
      "title": "Meter type"
    }
  ],
  "marks": [
    {
      "type": "group",
      "from": {
        "facet": {
          "data": "results",
          "name": "meters",
          "groupby": "key"
        }
      },
      "encode": {
        "enter": {
          "yc": {
            "scale": "layout",
            "field": "key",
            "band": 0.5
          },
          "height": {"signal": "plotWidth"},
          "width": {"signal": "width"}
        }
      },
      "data": [
        {
          "name": "summary",
          "source": "meters",
          "transform": [
            {
              "type": "aggregate",
              "fields": [
                "metricValue",
                "metricValue",
                "metricValue",
                "metricValue",
                "metricValue"
              ],
              "ops": ["min", "q1", "median", "q3", "max"],
              "as": ["min", "q1", "median", "q3", "max"]
            }
          ]
        }
      ],
      "marks": [
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"value": "black"},
              "height": {"value": 1}
            },
            "update": {
              "yc": {
                "signal": "plotWidth/2",
                "offset": -0.5
              },
              "x": {"scale": "xscale", "field": "min"},
              "x2": {"scale": "xscale", "field": "max"}
            }
          }
        },
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"scale": "color", "field": "key"},
              "cornerRadius": {"value": 10},

              "yc": {"signal": "plotWidth / 2"},
              "height": {"signal": "plotWidth / 2"},
              "x": {"scale": "xscale", "field": "q1"},
              "x2": {"scale": "xscale", "field": "q3"}
            }
          }
        },
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"value": "black"},
              "width": {"value": 2}
            },
            "update": {
              "yc": {"signal": "plotWidth / 2"},
              "height": {"signal": "plotWidth / 2"},
              "x": {"scale": "xscale", "field": "median"}
            }
          }
        },
        {
          "type": "text",
          "encode": {
            "enter": {
              "align": {"value": "center"},
              "baseline": {"value": "bottom"},
              "fill": {"value": "#444"}
            },
            "update": {
              "x": {
                "scale": "xscale",
                "signal": "tooltip.metric",
                "band": 0.5
              },
              "text": {"signal": "tooltip.metricValue"},
              "fillOpacity": [
                {"test": "datum === tooltip", "value": 0},
                {"value": 1}
              ]
            }
          }
        }
      ]
    }
  ]
}

2) My Elasticsearch query results

    "completeness_per_meter_type": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Landis + Gyr E350",
          "doc_count": 11712,
          "min_value": {
            "value": 0
          },
          "completeness_quantile": {
            "values": {
              "25.0": 0,
              "50.0": 777946.6999389499,
              "75.0": 1636599.4181818184
            }
          },
          "max_value": {
            "value": 2406472
          }
        },
        {
          "key": "Diehl/Hydrometer Hydrus DN 25",
          "doc_count": 7547,
          "min_value": {
            "value": 0
          },
          "completeness_quantile": {
            "values": {
              "25.0": 267769.1359104414,
              "50.0": 960271.0482142858,
              "75.0": 1735614.9509373924
            }
          },
          "max_value": {
            "value": 2408339
          }
        },
        {
          "key": "Inhemeter DTZ 1513i",
          "doc_count": 6199,
          "min_value": {
            "value": 0
          },
          "completeness_quantile": {
            "values": {
              "25.0": 168858.30629629627,
              "50.0": 860171.7062937064,
              "75.0": 1637965.7174185463
            }
          },
          "max_value": {
            "value": 2405565
          }
        },
        {
          "key": "DEMO Meter",
          "doc_count": 790,
          "min_value": {
            "value": 0
          },
          "completeness_quantile": {
            "values": {
              "25.0": 7.000000000000001,
              "50.0": 10,
              "75.0": 33.63333333333333
            }
          },
          "max_value": {
            "value": 912912.75
          }
        }
      ]
    }
  }

3) My visualization image

nyurik commented 6 years ago

@emilmirzayev Hi, I combined your code and data into a single example to try it (please submit questions in this form; it makes things much easier), but I don't see bars, only vertical lines. Could you adjust your example to show the problem? Also, I removed the dynamic height setting: there is no need for it, because Kibana's Vega automatically adjusts the graph size to the container size. There is also a problem with the graph's resizing in general, which might be a bug in Vega (I think I have seen something similar before). You may have to rename min_value and max_value to something else when you copy them from min_value.value and max_value.value: use a different "as" parameter and rename all the usages.
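A minimal sketch of the suggested renaming, assuming the hypothetical output names min_delay and max_delay (any names that differ from the source fields would do). Only the affected entries of the "results" transform list are shown; the q1, median, and q3 formulas stay as they are. In the spec as posted, the fold transform appears to be the only other place that refers to these names.

```json
{
  "transform": [
    {"type": "formula", "expr": "datum.min_value.value", "as": "min_delay"},
    {"type": "formula", "expr": "datum.max_value.value", "as": "max_delay"},
    {
      "type": "fold",
      "fields": ["min_delay", "q1", "median", "q3", "max_delay"],
      "as": ["metric", "metricValue"]
    }
  ]
}
```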

image

{
  "$schema": "https://vega.github.io/schema/vega/v3.json",
  "title": "Average delivery delay per meter type",
  "signals": [
    {"name": "plotWidth", "value": 100},
    {
      "name": "tooltip",
      "value": {},
      "on": [
        {"events": "rect:mouseover", "update": "datum"},
        {"events": "rect:mouseout", "update": "{}"}
      ]
    }
  ],
  "data": [
    {
      "name": "results",
      "values": {
        "aggregations": {
          "completeness_per_meter_type": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "Landis + Gyr E350",
                "doc_count": 11712,
                "min_value": {"value": 0},
                "completeness_quantile": {
                  "values": {
                    "25.0": 0,
                    "50.0": 777946.6999389499,
                    "75.0": 1636599.4181818184
                  }
                },
                "max_value": {"value": 2406472}
              },
              {
                "key": "Diehl/Hydrometer Hydrus DN 25",
                "doc_count": 7547,
                "min_value": {"value": 0},
                "completeness_quantile": {
                  "values": {
                    "25.0": 267769.1359104414,
                    "50.0": 960271.0482142858,
                    "75.0": 1735614.9509373924
                  }
                },
                "max_value": {"value": 2408339}
              },
              {
                "key": "Inhemeter DTZ 1513i",
                "doc_count": 6199,
                "min_value": {"value": 0},
                "completeness_quantile": {
                  "values": {
                    "25.0": 168858.30629629627,
                    "50.0": 860171.7062937064,
                    "75.0": 1637965.7174185463
                  }
                },
                "max_value": {"value": 2405565}
              },
              {
                "key": "DEMO Meter",
                "doc_count": 790,
                "min_value": {"value": 0},
                "completeness_quantile": {
                  "values": {
                    "25.0": 7.000000000000001,
                    "50.0": 10,
                    "75.0": 33.63333333333333
                  }
                },
                "max_value": {"value": 912912.75}
              }
            ]
          }
        }
      },
      "format": {
        "property": "aggregations.completeness_per_meter_type.buckets"
      },
      "transform": [
        {
          "type": "formula",
          "expr": "datum.completeness_quantile.values['25.0']",
          "as": "q1"
        },
        {
          "type": "formula",
          "expr": "datum.completeness_quantile.values['50.0']",
          "as": "median"
        },
        {
          "type": "formula",
          "expr": "datum.completeness_quantile.values['75.0']",
          "as": "q3"
        },
        {
          "type": "formula",
          "expr": "datum.min_value.value",
          "as": "min_value"
        },
        {
          "type": "formula",
          "expr": "datum.max_value.value",
          "as": "max_value"
        },
        {
          "type": "fold",
          "fields": [
            "min_value",
            "q1",
            "median",
            "q3",
            "max_value"
          ],
          "as": ["metric", "metricValue"]
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "layout",
      "type": "band",
      "range": "height",
      "domain": {"data": "results", "field": "key"}
    },
    {
      "name": "xscale",
      "type": "linear",
      "range": "width",
      "round": true,
      "domain": {"data": "results", "field": "metricValue"},
      "zero": true,
      "nice": true
    },
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "results", "field": "key"},
      "range": {"scheme": "category20"}
    }
  ],
  "legends": [{"stroke": "color", "title": "Meter type"}],
  "axes": [
    {
      "orient": "bottom",
      "scale": "xscale",
      "zindex": 1,
      "tickCount": 5,
      "title": "Delay in minutes"
    },
    {
      "orient": "left",
      "scale": "layout",
      "tickCount": 4,
      "zindex": 1,
      "title": "Meter type"
    }
  ],
  "marks": [
    {
      "type": "group",
      "from": {
        "facet": {
          "data": "results",
          "name": "meters",
          "groupby": "key"
        }
      },
      "encode": {
        "enter": {
          "yc": {"scale": "layout", "field": "key", "band": 0.5},
          "height": {"signal": "plotWidth"},
          "width": {"signal": "width"}
        }
      },
      "data": [
        {
          "name": "summary",
          "source": "meters",
          "transform": [
            {
              "type": "aggregate",
              "fields": [
                "metricValue",
                "metricValue",
                "metricValue",
                "metricValue",
                "metricValue"
              ],
              "ops": ["min", "q1", "median", "q3", "max"],
              "as": ["min", "q1", "median", "q3", "max"]
            }
          ]
        }
      ],
      "marks": [
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"value": "black"},
              "height": {"value": 1}
            },
            "update": {
              "yc": {"signal": "plotWidth/2", "offset": -0.5},
              "x": {"scale": "xscale", "field": "min"},
              "x2": {"scale": "xscale", "field": "max"}
            }
          }
        },
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"scale": "color", "field": "key"},
              "cornerRadius": {"value": 10},
              "yc": {"signal": "plotWidth / 2"},
              "height": {"signal": "plotWidth / 2"},
              "x": {"scale": "xscale", "field": "q1"},
              "x2": {"scale": "xscale", "field": "q3"}
            }
          }
        },
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"value": "black"},
              "width": {"value": 2}
            },
            "update": {
              "yc": {"signal": "plotWidth / 2"},
              "height": {"signal": "plotWidth / 2"},
              "x": {"scale": "xscale", "field": "median"}
            }
          }
        },
        {
          "type": "text",
          "encode": {
            "enter": {
              "align": {"value": "center"},
              "baseline": {"value": "bottom"},
              "fill": {"value": "#444"}
            },
            "update": {
              "x": {
                "scale": "xscale",
                "signal": "tooltip.metric",
                "band": 0.5
              },
              "text": {"signal": "tooltip.metricValue"},
              "fillOpacity": [
                {"test": "datum === tooltip", "value": 0},
                {"value": 1}
              ]
            }
          }
        }
      ]
    }
  ]
}

emilmirzayev commented 6 years ago

Hi Yuri, this is very weird. I just copy-pasted your example, and I do get the bars, but again in green (?!). I tried to use Debug mode but couldn't access the data that is defined within the mark. My Kibana version is 6.2.2 and I am using ELK on AWS. This is what I see:

image

Which further steps should I take?

nyurik commented 6 years ago

@emilmirzayev thanks for reporting, this does appear to be a regression in Vega (or possibly some unintended behavior that got fixed, but now it's breaking your graph). I will post more as soon as I know what's going on. P.S. You can try the graph on https://vega.github.io/editor/

nyurik commented 6 years ago

@emilmirzayev It turns out both problems were related. You forgot to group the aggregation by key, using "groupby": ["key"] in the aggregate transform of the "summary" data. As a result, there was no datum.key, so the color calculation was failing. The newer Vega handles this more correctly: it does not draw the broken rect.
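As a sketch, the "summary" dataset from the posted spec with the missing groupby added would look roughly like this. Everything except the "groupby" line is copied from the original; with it, each aggregated row carries a key field that the color scale can look up.

```json
{
  "name": "summary",
  "source": "meters",
  "transform": [
    {
      "type": "aggregate",
      "groupby": ["key"],
      "fields": [
        "metricValue",
        "metricValue",
        "metricValue",
        "metricValue",
        "metricValue"
      ],
      "ops": ["min", "q1", "median", "q3", "max"],
      "as": ["min", "q1", "median", "q3", "max"]
    }
  ]
}
```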

There is a trick I often use to debug Vega's autogenerated code. I open the Chrome debugger and enable breaking on errors, then change a field name to something broken, e.g. key.abc.xyz instead of key. This always produces an error: even if key exists, abc does not, so accessing xyz on it throws. The debugger stops and shows me the autogenerated code, where I can inspect which values are available on the datum object. That's where I saw that there was no key field.
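Applied to the box mark from this issue, the temporary change might look like the fragment below. This is only an illustration of the trick: the other encode properties are omitted for brevity, and the field should be set back to "key" once the debugger has shown what the datum contains.

```json
{
  "type": "rect",
  "from": {"data": "summary"},
  "encode": {
    "enter": {
      "fill": {"scale": "color", "field": "key.abc.xyz"}
    }
  }
}
```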