vega / vega-lite

A concise grammar of interactive graphics, built on Vega.
https://vega.github.io/vega-lite/
BSD 3-Clause "New" or "Revised" License

Adding shape and color to line chart renders two legends #3797

Open d-werner opened 6 years ago

d-werner commented 6 years ago

Hi guys,

I would like to set up a line chart with predefined color and shape scales. But I am already struggling with the default scales, because Vega-Lite renders two different legends, one for each of the two encoding channels.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "description": "A scatterplot showing horsepower and miles per gallons for various cars.",
  "data": {"url": "data/cars.json"},
  "mark": {"type": "line"},
  "encoding": {
    "x": {"field": "Year", "type": "temporal"},
    "y": {
      "aggregate": "mean",
      "field": "Miles_per_Gallon",
      "type": "quantitative"
    },
    "shape": {"field": "Origin", "type": "nominal"},
    "color": {"field": "Origin", "type": "nominal"}
  }
}

[Screenshot, 2018-05-23 11:26:28]

I would expect the chart to have a single, combined legend, which is the case for mark type "point".

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "description": "A scatterplot showing horsepower and miles per gallons for various cars.",
  "data": {"url": "data/cars.json"},
  "mark": {"type": "point"},
  "encoding": {
    "x": {"field": "Year", "type": "temporal"},
    "y": {
      "aggregate": "mean",
      "field": "Miles_per_Gallon",
      "type": "quantitative"
    },
    "shape": {"field": "Origin", "type": "nominal"},
    "color": {"field": "Origin", "type": "nominal"}
  }
}

[Screenshot, 2018-05-23 11:34:37]

This issue might be related to issue F with custom shapes.

BTW, the Vega editor shows the warning '[Warning] shape dropped as it is incompatible with "line".', although the encoding seems to be valid now that "(Macro) adding shape to line should automatically add another point layer" has been resolved. ;-)

d-werner commented 6 years ago

@jakevdp's solution for an interactive legend would also be a workaround for this issue! :-)

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "data": {
    "url": "data/cars.json"
  },
  "hconcat": [
    {
      "mark": "line",
      "encoding": {
        "x": {
          "field": "Year",
          "type": "temporal"
        },
        "y": {
          "aggregate": "mean",
          "field": "Miles_per_Gallon",
          "type": "quantitative"
        },
        "shape": {
          "field": "Origin",
          "type": "nominal",
          "legend": null
        },
        "color": {
          "field": "Origin",
          "type": "nominal",
          "legend": null
        }
      }
    },
    {
      "mark": "point",
      "encoding": {
        "shape": {
          "field": "Origin",
          "aggregate": "min",
          "type": "nominal",
          "legend": null
        },
        "color": {
          "field": "Origin",
          "aggregate": "min",
          "type": "nominal",
          "legend": null          
        },
        "fill": {
          "field": "Origin",
          "aggregate": "min",
          "type": "nominal",
          "legend": null 
        },
        "y": {          
          "field": "Origin",          
          "type": "nominal",
          "title": null
        }
      }
    }
  ]
}

[Screenshot, 2018-05-23 16:49:19]

domoritz commented 6 years ago

@d-werner thank you for the bug report. We will look into this. For now, a workaround may be to use a layer and define legends only for the point mark layer.
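A minimal sketch of what such a layered spec could look like, assuming the line layer suppresses its color legend with "legend": null and the point layer carries both color and shape. It only illustrates the suggestion; whether a single legend results depends on how the compiler resolves the scale domains of the two layers.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "description": "Sketch only: line layer without a legend, point layer with color and shape legends.",
  "data": {"url": "data/cars.json"},
  "layer": [
    {
      "mark": "line",
      "encoding": {
        "x": {"field": "Year", "type": "temporal"},
        "y": {
          "aggregate": "mean",
          "field": "Miles_per_Gallon",
          "type": "quantitative"
        },
        "color": {"field": "Origin", "type": "nominal", "legend": null}
      }
    },
    {
      "mark": "point",
      "encoding": {
        "x": {"field": "Year", "type": "temporal"},
        "y": {
          "aggregate": "mean",
          "field": "Miles_per_Gallon",
          "type": "quantitative"
        },
        "color": {"field": "Origin", "type": "nominal"},
        "shape": {"field": "Origin", "type": "nominal"}
      }
    }
  ]
}
```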

kanitw commented 6 years ago

@domoritz Our logic first merges scales of the same channel and then merges legends of different channels that have the same domain.

The problem is that the two layers get different data sources (data_0 and data_1). Since color gets merged first, it will now have the data sources:

[
  {"data": "data_0", "field": "Origin"},
  {"data": "data_1", "field": "Origin"}
]

while the shape scale still has only one of the data sources.

Since the domains are not the same, we can't just merge the two legends.
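To make the mismatch concrete, here is a hedged reconstruction of what the two compiled Vega scales might look like. The data source names come from the comment above; the scale names, the "ordinal" type, the ranges, and the choice of data_1 for the shape scale are assumptions for illustration, not actual compiler output.

```json
{
  "description": "Illustrative fragment only: color draws its domain from both layers, shape from one.",
  "scales": [
    {
      "name": "color",
      "type": "ordinal",
      "domain": {
        "fields": [
          {"data": "data_0", "field": "Origin"},
          {"data": "data_1", "field": "Origin"}
        ],
        "sort": true
      },
      "range": "category"
    },
    {
      "name": "shape",
      "type": "ordinal",
      "domain": {"data": "data_1", "field": "Origin", "sort": true},
      "range": "symbol"
    }
  ]
}
```

Although both domains would resolve to the same Origin values here, the domain definitions differ structurally, which is why a check for identical domains fails.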

Thus we would need to 1) properly merge parse (#2177) and identical transforms (#4016), and 2) deal with the fact that, even if we merge parse and aggregate, the point layer still has an invalid-value filter. (Meanwhile, lines handle invalid values using a conditional encoding for the "defined" channel, so that points can be skipped.)

To handle 2), we may need to change the invalid-value filter to use conditional logic that skips the point by setting "fill" and "stroke" to "none" instead.
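As a rough sketch of that idea, a compiled Vega symbol mark could use conditional encode rules rather than a filter transform. Everything here is an assumption for illustration: the scale names, the data source name, the aggregated field name mean_Miles_per_Gallon, and the "transparent" default fill. It is not the output of any Vega-Lite version.

```json
{
  "description": "Sketch only: hide invalid points via conditional fill/stroke instead of filtering them out.",
  "marks": [
    {
      "type": "symbol",
      "from": {"data": "data_0"},
      "encode": {
        "update": {
          "x": {"scale": "x", "field": "Year"},
          "y": {"scale": "y", "field": "mean_Miles_per_Gallon"},
          "shape": {"scale": "shape", "field": "Origin"},
          "stroke": [
            {
              "test": "datum['mean_Miles_per_Gallon'] === null || isNaN(datum['mean_Miles_per_Gallon'])",
              "value": "none"
            },
            {"scale": "color", "field": "Origin"}
          ],
          "fill": [
            {
              "test": "datum['mean_Miles_per_Gallon'] === null || isNaN(datum['mean_Miles_per_Gallon'])",
              "value": "none"
            },
            {"value": "transparent"}
          ]
        }
      }
    }
  ]
}
```

With no filter transform, the point layer could read from the same data source as the line layer, so the color and shape scales would share one domain definition.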

d-werner commented 6 years ago

Thanks for your feedback!

Using two layers was my first workaround. But it did not work and rendered two legends, just like @kanitw wrote. Thank you for the explanation!

kanitw commented 6 years ago

Yeah it still runs into the same problem.

g3o2 commented 6 years ago

Since the domains are not the same, we can't just merge the two legends.

Thus we would need to

  1. Properly merge parse #2177 and identical transforms
  2. Even if we merge parse and aggregate, the point layer still has an invalid-value filter. (Meanwhile, lines handle invalid values using a conditional encoding for the "defined" channel, so that points can be skipped.)

@kanitw For the above use case, that is, merging data coming from different encoding channels (color and shape), I understand this approach, because you might end up with incompatible domains due to filtering and other transforms.

However, when thinking about merging data across layers for any given encoding channel, I feel that requiring those layers to share the exact same domain (and thus the same transforms) is a restrictive approach. For this second use case, wouldn't it be sufficient and more useful to require only that the layers in question share the same field, field type and encoding channel?

kanitw commented 6 years ago

Checking field and type would work for many cases. (Note that shape and color are not the same encoding channel.)

However, it does not guarantee correctness. Suppose there are two data sources for two concatenated plots that happen to have the same field names. In such a case the scales can be totally different.
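A small, made-up example of that situation: two concatenated plots with separate inline data sources that both contain a nominal field named "group", but with unrelated values. The field names and values are hypothetical and serve only to show why matching on field name and type alone can be misleading.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "description": "Hypothetical data: same field name and type in both plots, but different domains.",
  "hconcat": [
    {
      "data": {
        "values": [
          {"x": 1, "y": 2, "group": "A"},
          {"x": 2, "y": 3, "group": "B"}
        ]
      },
      "mark": "point",
      "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"},
        "color": {"field": "group", "type": "nominal"}
      }
    },
    {
      "data": {
        "values": [
          {"x": 1, "y": 5, "group": "low"},
          {"x": 2, "y": 1, "group": "high"}
        ]
      },
      "mark": "point",
      "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"},
        "shape": {"field": "group", "type": "nominal"}
      }
    }
  ]
}
```

Here the color domain is ["A", "B"] and the shape domain is ["high", "low"], so a legend merged on field name and type would pair unrelated categories.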

My solution proposed above guarantees correctness. Refactoring to apply fill/stroke = none to invalid values instead of filtering them out is also useful, since it provides a shortcut for applying a different color to points with invalid values.

g3o2 commented 6 years ago

As regards the merging of data between different encoding channels, I fully agree with your proposal.

In issue #2177 (merging data between layers for the same encoding channel), you've already excluded concatenated plots and other forms of composition from the scope, and rightly so.

Consequently, if only applied to layers, correctness is still guaranteed when only checking field name and type.

utaal commented 6 years ago

I asked about this on Slack yesterday, and only found the issue now. I should stress that this is pretty important for accessibility, as using only color to convey information is problematic for color-blind users. More generally, you may want to consider making it easy to combine color with either shape or the upcoming strokeDash (maybe even as a default?). Relevant item in the WCAG guidelines.
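A minimal sketch of the color plus strokeDash combination on the cars example from this thread. It assumes a Vega-Lite release that includes the strokeDash channel (hence the v4 schema URL, which is later than the v2 specs above) and makes no claim about how the legends are rendered.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch only: encode Origin redundantly with color and stroke dash.",
  "data": {"url": "data/cars.json"},
  "mark": "line",
  "encoding": {
    "x": {"field": "Year", "type": "temporal"},
    "y": {
      "aggregate": "mean",
      "field": "Miles_per_Gallon",
      "type": "quantitative"
    },
    "color": {"field": "Origin", "type": "nominal"},
    "strokeDash": {"field": "Origin", "type": "nominal"}
  }
}
```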

hhexiy commented 5 years ago

any update on this issue?

domoritz commented 5 years ago

What @kanitw said in https://github.com/vega/vega-lite/issues/3797#issuecomment-391603239 is still true. We don't have an immediate update yet.

mikeandmore commented 4 years ago

I think the issue still exists. Just want to say that this is a blocker for anyone printing graphs on a printer or viewing them on an e-ink display.

Is it possible to provide an option to combine the legends without these checks? It would be incorrect, but would work around this bug.

AbeHandler commented 1 year ago

Just noting that I am seeing this issue as well. I was about to file a bug report but found this thread before posting. A fix would be very helpful!

PBI-David commented 6 months ago

Another example on SO.