vega / vega-lite

A concise grammar of interactive graphics, built on Vega.
https://vega.github.io/vega-lite/
BSD 3-Clause "New" or "Revised" License

Support Contour Plot #1919

Open kanitw opened 7 years ago

kanitw commented 7 years ago

Vega 3 already supports contour plots, and they seem like a kind of plot that is very useful for EDA.

domoritz commented 7 years ago

How is this different from a binned chart? Maybe with a smooth line.

jheer commented 7 years ago

It is related to, but fundamentally different from, a binned chart: kernel density estimation (KDE) reconstructs a continuous distribution from observed discrete samples. Vega 3 supports it, as well as the underlying mechanisms for fitting and plotting other continuous theoretical distributions (e.g., normal). These could be useful for EDA. See the violin plots example in the Vega 3 repo for one use case that Altair users (for example) might benefit from.
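To make the distinction concrete, here is a minimal sketch in plain JavaScript. It is not Vega's implementation: the Gaussian kernel, the bandwidth of 0.5, the sample values, and the function names `kde` and `histogram` are all assumptions chosen for illustration.

```javascript
// Gaussian kernel density estimate at position x for 1-D samples.
// (Illustrative only; kernel and bandwidth are assumptions.)
function kde(samples, bandwidth, x) {
  const norm = 1 / (samples.length * bandwidth * Math.sqrt(2 * Math.PI));
  let sum = 0;
  for (const s of samples) {
    const z = (x - s) / bandwidth;
    sum += Math.exp(-0.5 * z * z);
  }
  return norm * sum;
}

// Plain binning: counts over [lo, hi) split into equal-width bins.
function histogram(samples, lo, hi, bins) {
  const counts = new Array(bins).fill(0);
  const step = (hi - lo) / bins;
  for (const s of samples) {
    const i = Math.floor((s - lo) / step);
    if (i >= 0 && i < bins) counts[i] += 1;
  }
  return counts;
}

const samples = [1.0, 1.2, 1.9, 2.1, 4.0]; // made-up data
const binned = histogram(samples, 0, 5, 5); // discrete counts per bin
const smooth = [0, 1, 2, 3, 4, 5].map(x => kde(samples, 0.5, x)); // continuous estimate, sampled
```

The binned result is zero wherever a bin happens to be empty, while the KDE assigns a small positive density there and can be evaluated at any position, not just at bin boundaries.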

domoritz commented 6 years ago

[image: screen shot 2018-01-25 at 13 44 36]

I experimented with contour plots a bit last summer. While they look great, there are some challenges with overlapping contours if the data is faceted. I could not even use pure Vega when I wanted multiple contours to share the same color scale. The issue is that we don't compute the density independently from the lines.

kanitw commented 6 years ago

File a bug in Vega?

jheer commented 6 years ago

Not a bug, but perhaps a feature request, along with detailed descriptions of the required functionality. IIRC this might require an additional operation that calculates levels for each faceted contour and then performs aggregation to determine a scale range?

domoritz commented 6 years ago

Yes, this is not a bug in Vega but a missing feature. Contour plots in Vega use a single transform that computes density and the contour lines in one step. If we break these steps apart, we can compute the range of densities and share the same color scale across contours. There may be more to this (as well as some design work on the Vega-Lite side), so I don't feel confident filing an issue with Vega yet.

jheer commented 6 years ago

The density + contour calculation is "outsourced" to d3-contour, which computes the density and then outputs contours as GeoJSON: https://github.com/d3/d3-contour/blob/master/src/density.js

This method internally performs KDE and then invokes contour generation. If helpful, one might contribute a PR that breaks those steps apart into subroutines, e.g., one that produces a blurred (KDE) value grid and one that produces shared contour threshold values given one or more grids.

Ideally Vega would avoid internal duplication of effort, e.g., a collection of contours might be created once via a contour-specific groupby aggregate transform and then faceted by groupby keys.
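The two-step split described above could be sketched roughly as follows. This is not d3-contour's or Vega's code: the function names `densityGrid` and `sharedThresholds`, the brute-force Gaussian kernel, the per-facet normalization, the evenly spaced levels, and the sample facets are all assumptions made for illustration.

```javascript
// Step 1 (hypothetical helper): a blurred (KDE) value grid for one facet.
// `points` are [x, y] pairs in pixel space; the grid is cols x rows cells.
function densityGrid(points, cols, rows, cellSize, bandwidth) {
  const grid = new Float64Array(cols * rows);
  const twoVar = 2 * bandwidth * bandwidth;
  // Assumption: normalize by point count so facets of different sizes are comparable.
  const norm = 1 / (Math.PI * twoVar * Math.max(points.length, 1));
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      const gx = (c + 0.5) * cellSize; // cell center
      const gy = (r + 0.5) * cellSize;
      let sum = 0;
      for (const [px, py] of points) {
        const dx = gx - px;
        const dy = gy - py;
        sum += Math.exp(-(dx * dx + dy * dy) / twoVar);
      }
      grid[r * cols + c] = sum * norm;
    }
  }
  return grid;
}

// Step 2 (hypothetical helper): evenly spaced thresholds derived from the
// maximum density across ALL grids, so every facet uses the same levels.
function sharedThresholds(grids, count) {
  let max = 0;
  for (const g of grids) for (const v of g) if (v > max) max = v;
  return Array.from({ length: count }, (_, i) => (max * (i + 1)) / (count + 1));
}

// Group by facet key, compute each grid once, then derive the shared levels.
const facets = {
  a: [[10, 10], [12, 11], [11, 9]], // made-up points
  b: [[30, 30]],
};
const grids = Object.values(facets).map(pts => densityGrid(pts, 20, 20, 2, 4));
const thresholds = sharedThresholds(grids, 4);
```

Because the thresholds are computed from all grids together, their extent can also serve as the domain of a single color scale shared by every facet; contour generation would then run per grid against these fixed levels.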

kanitw commented 6 years ago

Would the faceting issue be a problem for violin/density plots as well?

Note that I'm adding a separate issue for violin/density plots in https://github.com/vega/vega-lite/issues/3442, and keeping this issue mainly for contour plots.

domoritz commented 6 years ago

@kanitw I don't think so because the density can be computed separately.

kanitw commented 6 years ago

@zening just mentioned today that we can "hack" contours to highlight points as a way to annotate plots too. (She will post a screenshot.)

light-and-salt commented 6 years ago

For my consistency project, I sometimes want to highlight a cluster of marks. For example, in the mockup below, my "console" prints a warning that says the blue in the color palette of view 2 conflicts with the blue in view 1. When the user hovers on the text warning, I want to highlight all marks in view 1, and the blue in view 2 palette. (Ideally I also want to draw a line connecting the two areas in conflict to make it explicit that the warning concerns two areas.)

[image: mockup]

To highlight the mark cluster in view 1, I want to draw a "cloud" that loosely fits the shape of the mark cluster. I tried to hack the Vega contour example to do this:

[image: screen shot 2018-03-05 at 7 49 57 pm]

{
  "$schema": "https://vega.github.io/schema/vega/v3.json",
  "width": 500,
  "height": 400,
  "padding": 5,
  "autosize": "pad",

  "data": [
    {
      "name": "source",
      "url": "data/cars.json",
      "transform": [
        {
          "type": "filter",
          "expr": "datum['Horsepower'] != null && datum['Miles_per_Gallon'] != null"
        }
      ]
    },
    {
      "name": "contours",
      "source": "source",
      "transform": [
        {
          "type": "contour",
          "x": {"expr": "scale('x', datum.Horsepower)"},
          "y": {"expr": "scale('y', datum.Miles_per_Gallon)"},
          "size": [{"signal": "width"}, {"signal": "height"}],
          "thresholds": [0, 0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008]
        }
      ]
    }
  ],

  "scales": [
    {
      "name": "x",
      "type": "linear",
      "round": true,
      "nice": true,
      "zero": false,
      "domain": {"data": "source", "field": "Horsepower"},
      "range": "width"
    },
    {
      "name": "y",
      "type": "linear",
      "round": true,
      "nice": true,
      "zero": false,
      "domain": {"data": "source", "field": "Miles_per_Gallon"},
      "range": "height"
    },
    {
      "name": "color",
      "type": "sequential",
      "zero": true,
      "domain": {"data": "contours", "field": "value"},
      "range": ["white", "#3ae0ae", "white", "white", "white", "white", "white", "white", "white"]
    }
  ],

  "marks": [
    {
      "type": "path",
      "from": {"data": "contours"},
      "encode": {
        "enter": {
          "stroke": {"value": "#888"},
          "strokeWidth": {"value": 0},
          "fill": {"scale": "color", "field": "value"},
          "fillOpacity": {"value": 0.35}
        }
      },
      "transform": [
        { "type": "geopath", "field": "datum" }
      ]
    },
    {
      "name": "marks",
      "type": "symbol",
      "from": {"data": "source"},
      "encode": {
        "update": {
          "x": {"scale": "x", "field": "Horsepower"},
          "y": {"scale": "y", "field": "Miles_per_Gallon"},
          "size": {"value": 4},
          "fill": [
            {"test": "true", "value": "gray"},
            {"value": "transparent"}
          ]
        }
      }
    }
  ]
}

light-and-salt commented 6 years ago

You can also hack the contour transform to highlight a single data point (or a subset of points):

[image: screen shot 2018-03-05 at 8 19 55 pm]

{
  "$schema": "https://vega.github.io/schema/vega/v3.json",
  "width": 500,
  "height": 400,
  "padding": 5,
  "autosize": "pad",

  "data": [
    {
      "name": "source",
      "url": "data/cars.json",
      "transform": [
        {
          "type": "filter",
          "expr": "datum['Horsepower'] != null && datum['Miles_per_Gallon'] != null"
        }
      ]
    },
    {
      "name": "contours",
      "source": "source",
      "transform": [
        {
          "type": "contour",
          "x": {"expr": "scale('x', 132)"},
          "y": {"expr": "scale('y', 32.7)"},
          "size": [{"signal": "width"}, {"signal": "height"}],
          "bandwidth": 30,
          "thresholds": [0.1, 0.2, 0.3, 0.4, 0.5]
        }
      ]
    }
  ],

  "scales": [
    {
      "name": "x",
      "type": "linear",
      "round": true,
      "nice": true,
      "zero": false,
      "domain": {"data": "source", "field": "Horsepower"},
      "range": "width"
    },
    {
      "name": "y",
      "type": "linear",
      "round": true,
      "nice": true,
      "zero": false,
      "domain": {"data": "source", "field": "Miles_per_Gallon"},
      "range": "height"
    },
    {
      "name": "color",
      "type": "sequential",
      "zero": true,
      "domain": {"data": "contours", "field": "value"},
      "range": ["#3ae0ae", "white", "white"]
    }
  ],

  "marks": [
    {
      "type": "path",
      "from": {"data": "contours"},
      "encode": {
        "enter": {
          "stroke": {"value": "#888"},
          "strokeWidth": {"value": 0},
          "fill": {"scale": "color", "field": "value"},
          "fillOpacity": {"value": 0.35}
        }
      },
      "transform": [
        { "type": "geopath", "field": "datum" }
      ]
    },
    {
      "name": "marks",
      "type": "symbol",
      "from": {"data": "source"},
      "encode": {
        "update": {
          "x": {"scale": "x", "field": "Horsepower"},
          "y": {"scale": "y", "field": "Miles_per_Gallon"},
          "size": {"value": 4},
          "fill": [
            {"test": "true", "value": "gray"},
            {"value": "transparent"}
          ]
        }
      }
    }
  ]
}

al6x commented 4 years ago

Maybe a temporary solution could be implemented? In a separate library, like vega-lite-contour? So people would be able to start working with it and it would be easier to find out use cases and how to better integrate it into vega-lite in the future?

Pseudo-code

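// Pseudo-code: the 'vega-lite-contour' package and the vega_lite.plot call are hypothetical.
import { add_contour_plot_curves } from 'vega-lite-contour'
import { vega_lite } from 'vega-lite'

const table = [...] // data table
const table_with_contour = add_contour_plot_curves(table, { ... density estimation options ... })
vega_lite.plot(table_with_contour, plot_options) // plot the augmented table, not the original

loganpowell commented 4 years ago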

🙏