vega / vega-lite

A concise grammar of interactive graphics, built on Vega.
https://vega.github.io/vega-lite/
BSD 3-Clause "New" or "Revised" License

Bin is not working with specifying domain #2422

Closed yhoonkim closed 7 years ago

yhoonkim commented 7 years ago

When I specify scale.domain on a binned field, binning doesn't work.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "data": {
    "values": [
      {"x": 0.3349},
      {"x": 0.1216},
      {"x": 0.8341},
      {"x": 0.3341}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {
      "field": "x",
      "type": "quantitative",
      "scale": {"domain": [0,1]},
      "bin": {"maxbins": 20}
    },
    "y": {
      "field": "*",
      "type": "quantitative",
      "aggregate": "count"
    }
  }
}

Rendered: [image not preserved]

What I expected (what Vega-Lite 1 returns): [image not preserved]

yhoonkim commented 7 years ago

My guess is that the reported Vega-Lite spec should compile to this:

{
  "$schema": "http://vega.github.io/schema/vega/v3.0.json",
  "autosize": "pad",
  "padding": 5,
  "data": [
    {
      "name": "source_0",
      "values": [
        {"x": 0.3349},
        {"x": 0.1216},
        {"x": 0.8341},
        {"x": 0.3341}
      ],
      "format": {"type": "json","parse": {"x": "number"}},
      "transform": [
        {
          "type": "filter",
          "expr": "datum[\"x\"] !== null && !isNaN(datum[\"x\"])"
        },
        {
          "type": "extent",
          "field": "x",
          "signal": "bin_maxbins_20_x_extent"
        },
        {
          "type": "bin",
          "field": "x",
          "as": ["bin_maxbins_20_x_start","bin_maxbins_20_x_end"],
          "signal": "bin_maxbins_20_x_bins",
          "maxbins": 20,
          "extent": [0,1] // Currently, Vega-Lite returns this part as {"signal": "bin_maxbins_20_x_extent"}
        },
        {
          "type": "aggregate",
          "groupby": ["bin_maxbins_20_x_start","bin_maxbins_20_x_end"],
          "ops": ["count"],
          "fields": ["*"],
          "as": ["count_*"]
        }
      ]
    }
  ],
  "signals": [
    {"name": "width","update": "200"},
    {"name": "height","update": "200"}
  ],
  "marks": [
    {
      "name": "nested_main_group",
      "type": "group",
      "encode": {
        "update": {
          "width": {"signal": "width"},
          "height": {"signal": "height"},
          "fill": {"value": "transparent"}
        }
      },
      "marks": [
        {
          "name": "marks",
          "type": "rect",
          "role": "bar",
          "from": {"data": "source_0"},
          "encode": {
            "update": {
              "x2": {
                "scale": "x",
                "field": "bin_maxbins_20_x_start",
                "offset": 1
              },
              "x": {"scale": "x","field": "bin_maxbins_20_x_end"},
              "y": {"scale": "y","field": "count_*"},
              "y2": {"scale": "y","value": 0},
              "fill": {"value": "#4c78a8"}
            }
          }
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "bin-linear",
      "domain": {
        "signal": "sequence(bin_maxbins_20_x_bins.start, bin_maxbins_20_x_bins.stop + bin_maxbins_20_x_bins.step, bin_maxbins_20_x_bins.step)"
      }, // Currently, Vega-Lite returns this part as [0,1].
      "range": [0,200],
      "round": true,
      "nice": true
    },
    {
      "name": "y",
      "type": "linear",
      "domain": {"data": "source_0","field": "count_*"},
      "range": [200,0],
      "round": true,
      "nice": true,
      "zero": true
    }
  ],
  "axes": [
    {
      "scale": "x",
      "format": "s",
      "orient": "bottom",
      "title": "BIN(x)",
      "zindex": 1,
      "encode": {
        "labels": {
          "update": {
            "angle": {"value": 270},
            "align": {"value": "right"},
            "baseline": {"value": "middle"}
          }
        }
      }
    },
    {
      "scale": "y",
      "format": "s",
      "orient": "left",
      "title": "Number of Records",
      "zindex": 1
    },
    {
      "scale": "y",
      "domain": false,
      "format": "s",
      "grid": true,
      "labels": false,
      "orient": "left",
      "ticks": false,
      "zindex": 0,
      "gridScale": "x"
    }
  ]
}

There are two changes from the Vega spec that Vega-Lite currently compiles from the reported spec: 1) the extent of the bin transform (in data[i].transform) is set to the domain that the user specifies in the Vega-Lite spec, rather than to the extent signal. 2) scales[0].domain gets the proper signal rather than being overridden as [0,1].

domoritz commented 7 years ago

Good catch. So it sounds like the issue is that we should use the explicit domain as the bin extent and not as the scale domain. Do you think you can take a stab at this?

yhoonkim commented 7 years ago

@domoritz I'd like to try to solve this issue myself, if you're OK with some delay :) (It will help me get used to the Vega-Lite repo!)

domoritz commented 7 years ago

Sounds good. Let me know if you have questions.

yhoonkim commented 7 years ago

@domoritz I realized that there is bin.extent for setting the domain of the bin, so my problem is solved. However, there is still a problem when scale.domain and bin.extent are specified together.

My suggestion is to ignore scale.domain when the field is binned and to show a warning message.
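
For reference, here is a minimal sketch of the bin.extent workaround described above, adapted from the spec in the original report. It assumes that bin.extent takes a [min, max] array and that dropping scale.domain is acceptable for the chart. It has not been run against a specific Vega-Lite release.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "data": {
    "values": [
      {"x": 0.3349},
      {"x": 0.1216},
      {"x": 0.8341},
      {"x": 0.3341}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {
      "field": "x",
      "type": "quantitative",
      "bin": {"maxbins": 20, "extent": [0, 1]}
    },
    "y": {
      "field": "*",
      "type": "quantitative",
      "aggregate": "count"
    }
  }
}
```

The only difference from the reported spec is on the x channel: scale.domain is removed and the same [0, 1] range is passed as bin.extent, so the binning itself covers the intended range.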

domoritz commented 7 years ago

Thanks for looking into it. Yes, that's a great suggestion.

kanitw commented 7 years ago

See #2429

kanitw commented 7 years ago

Looks like this is still incorrect and should be fixed as a part of https://github.com/vega/vega-lite/issues/2791.