clauswilke / dataviz

A book covering the fundamentals of data visualization
https://clauswilke.com/dataviz

Overlapping dots in dotplot #88

Closed nielsswinkels closed 5 years ago

nielsswinkels commented 5 years ago

Not sure if this is a bug, and not sure where to fix it, so I'm posting it as an issue instead:

In this image, the dots of the dotplot overlap slightly in some places. My guess is that this is not supposed to happen? https://serialmentor.com/dataviz/visualizing_uncertainty_files/figure-html/election-quantile-dot-1.png

From chapter 16.1: https://serialmentor.com/dataviz/visualizing-uncertainty.html

ptoche commented 5 years ago

Claus will be able to tell you, I'm sure, but it looks like you need to set a greater width. If you produce the plot in RStudio (using the code below) and click on Zoom you'll notice that as you resize the window the dots never overlap along the vertical dimension, but if you set too narrow a width they can overlap along the horizontal dimension.

Edit: My guess is you would need to set width parameters in the source file (rmarkdown). Something like this (exaggerating the width):

```{r, fig.width=10, fig.height=2.5}
## etc.
```

If you wanted to export the plot below with ggsave, you could set an appropriate width inside the ggsave call.

Copying some of the code for convenience:

```r
library(ggplot2)
library(dplyr)

mu <- 1.02
sd <- 0.9
binwidth <- 0.31
binwidth <- 0.29 # overrides the value above; 0.29 is what the plot uses
x <- c(seq(-2.5, 0, length.out = 50), seq(0.00001, 5, length.out = 100))
# normal density curve, split at x = 0
df_norm <- data.frame(
  x,
  y = dnorm(x, mu, sd),
  type = ifelse(x <= 0, "A", "B")
)
# 50 quantiles of the same distribution, one per dot
df_q <- data.frame(x = qnorm(ppoints(50), mu, sd)) %>%
  mutate(type = ifelse(x <= 0, "A", "B"))
p1 <- ggplot(df_q, aes(x, fill = type)) +
  geom_vline(xintercept = 0, linetype = 2, color = "gray50") +
  geom_line(data = df_norm, aes(x, 1.92*y)) + # factor 1.92 manually determined
  geom_dotplot(binwidth = binwidth) +
  scale_x_continuous(
    name = NULL, #"percent point advantage for blue",
    labels = scales::percent_format(accuracy = 0.1, scale = 1)
  ) +
  scale_y_continuous(
    name = NULL,
    breaks = NULL,
    expand = c(0, 0),
    limits = c(0, 0.9)
  ) +
  scale_fill_manual(
    values = c(A = "#f8f1a9", B = "#b1daf4"),
    guide = "none"
  )

p1 # print the plot
```

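
For the ggsave route, a minimal sketch might look like the following. The stand-in plot `p`, the output path `out_file`, and the chosen dimensions are illustrative assumptions, not values from the book's source; in practice you would pass the `p1` object built above.

```r
library(ggplot2)

# Stand-in plot (assumption): the same 50 quantiles as in the thread,
# without the styling. Replace `p` with `p1` when using the full code.
p <- ggplot(data.frame(x = qnorm(ppoints(50), 1.02, 0.9)), aes(x)) +
  geom_dotplot(binwidth = 0.29)

# Hypothetical output path; width and height are in inches by default.
out_file <- file.path(tempdir(), "quantile-dotplot.png")

# A deliberately wide, short canvas, mirroring fig.width = 10, fig.height = 2.5.
ggsave(out_file, plot = p, width = 10, height = 2.5, dpi = 300)
```

Because the width is fixed in the call, the saved file no longer depends on the size of the RStudio plot pane.
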
clauswilke commented 5 years ago

It's a property of quantile dotplots that can't always be avoided. It depends on how well a distribution can be represented by equally sized circles. I have reviewed the figure and have decided to leave it as is.
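
One way to check where such overlaps come from is to bin the quantiles by hand. The sketch below is a simplified, single-group approximation of dot-density binning as I understand ggplot2's default `method = "dotdensity"` to work; it is an assumption about the library's internals, not code taken from ggplot2 or from the book. The names `centers`, `counts`, `gaps`, and `overlap` are made up for illustration.

```r
# Quantile dots as in the thread's code: 50 quantiles of N(mu, sd).
mu <- 1.02
sd <- 0.9
binwidth <- 0.29
x <- sort(qnorm(ppoints(50), mu, sd))

# Simplified dot-density binning (assumed behaviour, single group):
# open a bin at the first unassigned point and put every point closer
# than `binwidth` to that starting point into the same bin.
bin <- integer(length(x))
bin_end <- -Inf
current <- 0L
for (i in seq_along(x)) {
  if (x[i] >= bin_end) {
    current <- current + 1L
    bin_end <- x[i] + binwidth
  }
  bin[i] <- current
}

# Each stack of dots is placed at the midpoint of its bin's data range.
centers <- as.numeric(tapply(x, bin, function(v) (min(v) + max(v)) / 2))
counts <- as.integer(table(bin))

# With the default dotsize of 1, a dot's diameter equals `binwidth`, so
# neighbouring stacks overlap when their centers are closer than that.
gaps <- diff(centers)
overlap <- data.frame(
  left = head(centers, -1),
  right = tail(centers, -1),
  gap = gaps,
  overlaps = gaps < binwidth
)
print(overlap)
```

Rows where `overlaps` is `TRUE` mark neighbouring stacks whose centers sit closer together than one dot diameter. Since the stack positions follow the data rather than a fixed grid, changing `binwidth` moves the overlaps around but does not guarantee that they disappear.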