clauswilke / dataviz

A book covering the fundamentals of data visualization
https://clauswilke.com/dataviz

Types of bad #35

Closed steveharoz closed 6 years ago

steveharoz commented 6 years ago

(continuing a discussion from twitter)

While I like that visualizations are labeled as bad or ugly, it'd be informative to make those designations more consistent and clear.

Here are possible categories:

  1. Wrong - The wrong information is shown on the screen (e.g., a log-scaled axis whose label also says the values are logged, which implies a double log)
  2. Deceiving - The information may be misperceived unless you pay careful attention (e.g., small multiples with different y-axes)
  3. Imprecise - Not necessarily the wrong information, but may not be good for reading/comparing individual values (e.g., pie charts with many slices or stacked bar charts)
  4. Not optimal - Some tasks may be difficult (e.g., it is hard to find a particular item when bars are out of order)
  5. Ugly - Claus doesn't like it (e.g., angled x-axis text)

You'll probably want to combine some of those categories for simplicity.

What's tough is that a lot of these depend on which information a person wants. Stacked bars are imprecise for individual comparison, but do well for comparing the total size of the stack to another stack.
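To make the "deceiving" example concrete (small multiples with different y-axes), here is a minimal Python sketch. The data values, the panel names, and the helper `drawn_height` are hypothetical and chosen only for illustration. It assumes linear axes that start at zero.

```python
def drawn_height(value, axis_min, axis_max):
    """Fraction of the panel height that a bar occupies on a linear axis."""
    return (value - axis_min) / (axis_max - axis_min)


# Hypothetical data: panel B's values are ten times larger than panel A's.
panels = {"A": [10, 20], "B": [100, 200]}

# Free y-axes: each panel runs from 0 to its own maximum.
free = {
    name: [drawn_height(v, 0, max(vals)) for v in vals]
    for name, vals in panels.items()
}

# Shared y-axis: every panel runs from 0 to the overall maximum.
shared_max = max(max(vals) for vals in panels.values())
shared = {
    name: [drawn_height(v, 0, shared_max) for v in vals]
    for name, vals in panels.items()
}

print("free axes:  ", free)
print("shared axis:", shared)
```

With free axes the two panels draw bars of identical height, so the tenfold difference is visible only in the tick labels. With a shared axis the difference is visible in the bars themselves.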

clauswilke commented 6 years ago

Thanks for spelling out these categories. I’ll have to ponder this some more. I think it would be best to have at most 3 different categories in the final book.

In any case, my first priority over the next few weeks is to write drafts of the remaining chapters. Once I have those, I can look over all figures and see what categories make the most sense. It probably isn’t useful to add a category that would be used only once or twice in the entire book.

clauswilke commented 6 years ago

@steveharoz I've thought this over and attempted to visualize my thinking with an example. Would be interested to hear your perspective.

I think there are broadly three distinct aspects to a visualization: (i) the math; (ii) human perception; (iii) aesthetics/design. If something with the math is wrong, the plot is simply incorrect. That's your point 1. Your points 2 and 3 (and maybe 4) are related to perception. Even if the math is correct, humans may have difficulty seeing the plot correctly. Finally, a plot may simply have poorly chosen design, even if the information is clearly presented. This leads me to three categories, "wrong", "bad", and "ugly". The boundary between "bad" and "ugly" is somewhat fluid, and in borderline cases it may be better to go with "ugly" than with "bad" to express that there's an element of personal choice.

I've made a figure that tries to explain these three categories by showing different variants of the same figure:

[Image: screen shot 2018-07-25 at 2 29 40 pm]

I'm not entirely sure I like the "bad" case here. Now that I type it, maybe a better example would be one where each bar has its own y axis with a different axis range.

clauswilke commented 6 years ago

Alternative version where Part (c) has different y-axis scales.

[Image: screen shot 2018-07-25 at 2 54 19 pm]

steveharoz commented 6 years ago

Looks good! The alternative seems more clear to me.

I agree that the "bad" category is a bit fluid. But it's fine to have a catchall where careful attention is required to overcome low discriminability, to avoid missing something, or to avoid drawing the wrong conclusion.

Since the "bad" category will inevitably have a lot of borderline cases, maybe "caution" is a better term? If you're not careful, you may misinterpret differences/similarities, miss important information, or just be slow at simple tasks. Also, "caution" allows for cases where a visualization works well or performs poorly depending on its use. (pardon the self citation for that last point - http://steveharoz.com/research/attention/papers/Haroz_Whitney_2012_InfoVis.pdf)

clauswilke commented 6 years ago

I like the idea of "caution" but not the word. I think the words need to be all adjectives (or alternatively, all nouns). I could use "deceiving" but it doesn't have the same punch as "bad".

steveharoz commented 6 years ago

hazardous unreliable sketchy risky

clauswilke commented 6 years ago

Thanks!

"ugly", "sketchy", "bad" seems a good progression.

@mikeloukides, any comments?

clauswilke commented 6 years ago

[Image: screen shot 2018-07-25 at 4 09 01 pm]

mikeloukides commented 6 years ago

Sorry--I've been buying a new car. (Nothing exciting.)

Anyway--I agree with the principle of having 3 categories. Anything more will be confusing. That's (part of) why 3 shows up as a "magic number" in a lot of stories.

"Sketchy" doesn't sound like the right word... is that intermediate between "ugly" and "bad"?

With my background in documentation, I'd stay away from words like "caution" and "warning." They actually have fairly precisely defined semantics for at least some people in our audience. Probably only a small minority, frankly, but it's there. ("Warning" means there's a possibility of damage to health or life; "caution" means there's a possibility of damage to equipment or data.) "Hazardous" probably takes you into the same area.

You could use "deceptive" rather than "deceiving", though I agree that still has less punch than "bad." But if your intent is to indicate that the visualization is misleading, "deceptive" is very clear and descriptive.

Mike Loukides VP, Content Strategy O'Reilly Media, Inc.

clauswilke commented 6 years ago

Mike, thanks for your input. I think "deceptive" is too precise, because some of the figures that would fit into this bin are confusing or unclear rather than deceptive (e.g., a figure with too many different colors, or with colors that are hard for colorblind people to distinguish). I guess we're back at "ugly", "bad", "wrong". These words are less specific, so I have more room to define them appropriately.

mikeloukides commented 6 years ago

That's fine with me.

steveharoz commented 6 years ago

What was the problem with "sketchy"? "Bad" seems inappropriately judgmental.

mikeloukides commented 6 years ago

Sketchy can mean a lot of different things, especially in this context. Drawn badly (like a sketch), possibly dishonest (which is what you're going for), not trustworthy (which is a similar idea, but we're talking about the presentation of the data, not the data itself--and trustworthy, at least to me, seems to refer to the data).

Of course, 'bad' can also mean a lot of things.

But I also don't have a huge problem with being judgemental.

clauswilke commented 6 years ago

I'm closing this issue. It is addressed in the latest build available online.

@steveharoz, if you have comments on specific figures please open separate issues.