vega / vega-lite

A concise grammar of interactive graphics, built on Vega.
https://vega.github.io/vega-lite/

Correctly Cross Nested Facet #2775

Open kanitw opened 7 years ago

kanitw commented 7 years ago

Just like how we have scale resolution.

When facets are nested, we need to resolve the domain of the inner facet and impute additional facet cells accordingly.

For example, given a spec:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "data": {"url": "data/cars.json"},
  "facet": {"column": {"field": "Origin","type": "nominal"}},
  "spec": {
    "facet": {
      "column": {"field": "Cylinders","type": "ordinal"},
      "row": {"bin": {"maxbins": 4}, "field": "Displacement","type": "quantitative"}
    },
    "spec": {
      "mark": "point",
      "encoding": {
        "x": {"field": "Horsepower","type": "quantitative"},
        "y": {"field": "Acceleration","type": "quantitative"}
      }
    }
  }
}

By default, we would expect the table to be

( Origin \ Cylinder ) x Bin(Displacement)

which means Bin(Displacement) is crossed against Origin.

However, what we currently generate is Origin \ (Cylinder x Bin(Displacement)), so Bin(Displacement) is nested under Origin instead.

Thus, the output nested facet "table" is imbalanced:

[Figure: vega_editor screenshot of the imbalanced nested facet table]
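
The difference between the two layouts can be sketched with a few made-up rows. This is only an illustration in Python, not how Vega-Lite compiles facets; the records, the bin labels, and the helper names `nested_row_domains` and `crossed_row_domain` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (Origin, binned Displacement label).
# These are NOT the real cars.json values.
rows = [
    ("USA", "0-100"), ("USA", "100-200"), ("USA", "200-300"), ("USA", "300-400"),
    ("Europe", "0-100"), ("Europe", "100-200"),
    ("Japan", "0-100"),
]

def nested_row_domains(records):
    """Row domain resolved separately inside each outer facet cell (nested)."""
    domains = defaultdict(set)
    for origin, disp_bin in records:
        domains[origin].add(disp_bin)
    return {origin: sorted(bins) for origin, bins in domains.items()}

def crossed_row_domain(records):
    """One row domain shared by every outer facet cell (crossed)."""
    return sorted({disp_bin for _, disp_bin in records})

nested = nested_row_domains(rows)   # each Origin gets its own set of rows
crossed = crossed_row_domain(rows)  # every Origin gets the same set of rows
```

With the nested resolution, each Origin ends up with a different number of rows, which is the imbalance described above; with the crossed resolution, every Origin column would share the same row bins.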

We need a few things to make this correct

jheer commented 7 years ago

Regarding "a mechanism for imputing Vega's facet operator to fill the domain": is it sufficient to use the cross option, which you can specify under facet.aggregate? This will impute missing cells.

Or, do you want to distribute this across multiple facet operations, and be able to provide a set of domain values to parameterize the imputation?

kanitw commented 7 years ago

This is a simpler example: https://pastebin.com/AaC9DFUH (I still set columns: 3 as a hack because I can't get vega-editor to pick up the latest Vega-View yet.)

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "data": {"url": "data/cars.json"},
  "facet": {"column": {"field": "Origin","type": "nominal"}},
  "spec": {
    "facet": {
      "column": {"field": "Cylinders","type": "ordinal"},
      "row": {"bin": {"maxbins": 4}, "field": "Displacement","type": "quantitative"}
    },
    "spec": {
      "mark": "point",
      "encoding": {
      }
    }
  }
}

[Figure: vega_editor screenshot]

In this figure, the Europe and Japan cells have cross: true turned on, yet they still do not know that there should be 4 bins for bin_displacement.

If the data sources were wired up correctly, we would have a child_row_domain data source that has 4 bins for bin_displacement. Ideally, we could then wire this row_domain data source to the facet's "impute" directive, so that bin_displacement is imputed with all domain values when aggregating and crossing.
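
A rough sketch of that imputation, in Python rather than Vega dataflow: crossing only the values seen inside one outer cell cannot produce bins that cell never observed, whereas passing in a shared row domain can. The function `impute_cells`, its parameters, and the sample data are assumptions made for illustration; they do not correspond to an existing Vega or Vega-Lite API.

```python
from collections import Counter
from itertools import product

def impute_cells(records, row_domain=None, col_domain=None):
    """Count records per (row, column) cell, adding empty cells for the full cross product.

    records: list of (row_value, col_value) pairs from ONE outer facet cell.
    row_domain / col_domain: optional shared domains. When omitted, only the
    values observed in `records` are crossed (a purely local cross).
    """
    counts = Counter(records)
    rows = row_domain if row_domain is not None else sorted({r for r, _ in records})
    cols = col_domain if col_domain is not None else sorted({c for _, c in records})
    return {(r, c): counts.get((r, c), 0) for r, c in product(rows, cols)}

# Hypothetical data for the Japan cell: (bin_displacement, Cylinders).
japan = [("0-100", 3), ("0-100", 4), ("100-200", 4), ("100-200", 6)]

# Hypothetical shared domain, standing in for the child_row_domain data source.
child_row_domain = ["0-100", "100-200", "200-300", "300-400"]

local = impute_cells(japan)                                # 2 bins x 3 cylinder values
shared = impute_cells(japan, row_domain=child_row_domain)  # 4 bins x 3 cylinder values
```

The local cross fills holes only among the two bins Japan actually has, while the version parameterized by the shared domain also creates empty cells for the two bins that appear only in other Origin groups.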

(I think this is a bit hairy so I prefer that we do not support this in VL 2.0 -- but I wanna note down so we can come back to it in the future.)

A related question is whether we wanna allow facet of facet at all.
(Nested Facet without crossing is working fine but nested facet with crossing is not as described above.)

bigfacewo commented 2 years ago

Hi. This issue has not been closed. I would like to know how the issue is progressing. Thank you.