wilkelab / ggridges

Ridgeline plots in ggplot2
https://wilkelab.org/ggridges
GNU General Public License v2.0

Error in alternating colors when using "months" as labels with scale_fill_cyclical #33

Closed gponce-ars closed 5 years ago

gponce-ars commented 5 years ago

Alternating colors in geom_density_ridges break when months are used on the y-axis.

The example "Results from Catalan regional elections, 1980-2015" alternates colors by year. I modified it to use month abbreviations instead, matching each year with one month abbreviation.

# Goal: change the y-axis labels. Instead of `Year`, I want to use months, keeping the alternating colors for the two groups.

# R version 3.5.1 (2018-07-02)
# Platform: x86_64-apple-darwin15.6.0 (64-bit)
# Running under: macOS  10.14.3
# [1] ggplot2_3.1.0     bindrcpp_0.2.2    data.table_1.11.8 ggridges_0.5.1
# [5] forcats_0.4.0     dplyr_0.7.7

library(data.table)
library(dplyr)
library(forcats)
library(ggridges)
library(ggplot2)

# Get "Catalan_elections" dataset as a data.table
dt_Catalan_elections <- as.data.table(Catalan_elections)
# Duplicate the 2015 rows as an extra year (2016) so the number of unique years matches the 12 months.
dt_n <- dt_Catalan_elections[Year==2015,]
dt_n[,Year:=2016]
dt_new <- rbindlist(list(dt_Catalan_elections, dt_n))
dt_new[,Month:=as.character(Year)]
old <- unique(dt_new$Month)
# Assign a month to each year in order (first year -> Jan, second year -> Feb, etc.)
dt_new[,month := factor(Month, levels = old, labels = month.abb)]

# Now plot ggridges using `month` instead of Year for Y-axis labels
dt_new %>%
  mutate(MonthFct = fct_rev(as.factor(month))) %>%
  ggplot(aes(y = MonthFct)) +
  geom_density_ridges(
    aes(x = Percent, fill = paste(MonthFct, Option)), 
    alpha = .8, color = "white", from = 0, to = 100
  ) +
  labs(
    x = "Vote (%)",
    y = "Election",
    title = "Indy vs Unionist vote in Catalan elections",
    subtitle = "Analysis unit: municipalities (n = 949)",
    caption = "Marc Belzunces (@marcbeldata) | Source: Idescat"
  ) +
  scale_y_discrete(expand = c(0.01, 0)) +
  scale_x_continuous(expand = c(0.01, 0)) +
  scale_fill_cyclical(
    breaks = c("Jan Indy", "Jan Unionist"),
    labels = c(`Jan Indy` = "Indy", `Jan Unionist` = "Unionist"),
    values = c("#ff0000", "#0000ff", "#ff8080", "#8080ff"),
    name = "Option", guide = "legend"
  ) +
  theme_ridges(grid = FALSE)

See how the color alternation is not working as expected. Is the color alternation linked to an alphanumeric sort?

image

Is there a way to use the sequence of months (Jan-Dec) with a correct color alternation in a dataset like that?

clauswilke commented 5 years ago

This usually happens when the variable you're coloring by is in a different order than you think it is. It's unlikely to be a bug, and hence I'll close this issue. You could try to ask for help on stackoverflow.
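A minimal sketch of how to check that order, using only base R. It assumes that a discrete scale orders character fill values with `sort(unique(x))` and that the cyclical scale recycles its `values` over those sorted keys; `keys`, `sorted_keys`, `fill_map`, and `indy_colors` are names made up for this illustration.

```r
# Fill keys built the same way as paste(MonthFct, Option) in the question.
keys <- as.vector(outer(month.abb, c("Indy", "Unionist"), paste))

# Assumption: a discrete scale orders character values by sort(unique(x)).
sorted_keys <- sort(unique(keys))

# Assumption: the cyclical scale recycles `values` over the sorted keys.
values <- c("#ff0000", "#0000ff", "#ff8080", "#8080ff")
fill_map <- setNames(rep_len(values, length(sorted_keys)), sorted_keys)

# Colors the Indy ridges would receive, listed in calendar order.
indy_colors <- fill_map[paste(month.abb, "Indy")]

print(head(sorted_keys, 4))   # keys start with "Apr", not "Jan"
print(unname(indy_colors))    # dark/light no longer alternate month by month
```

Under these assumptions the keys sort alphabetically (Apr, Aug, Dec, Feb, Jan, ...), so neighboring months such as Feb and Mar end up with the same shade.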

gponce-ars commented 5 years ago

Thank you, Claus. My mistake. In case someone runs into something similar, here is what worked for me. I'll also post it in this incomplete Stack Overflow post.

# Making ggridges work with alternating colors and character labels on the y-axis.
# The key points are:

# 1. Create the label column as a factor: use the numeric values (here, the years) as levels and the y-axis labels (e.g. month.abb) as labels.
# 2. In the aes(y = ) call, use the factor column created in the previous step.
# 3. For the fill, combine the numeric value with the grouping variable, so the fill keys sort in the intended order.

# See the reproducible example based on the Catalan_elections dataset/example.
library(data.table)
library(dplyr)
library(forcats)
library(ggridges)
library(ggplot2)

# Get "Catalan_elections" dataset as a data.table 
dt_Catalan_elections <- as.data.table(Catalan_elections)
# Duplicate the 2015 rows as an extra year (2016) so the number of unique years matches the 12 months.
dt_n <- dt_Catalan_elections[Year==2015,]
dt_n[,Year:=2016]
dt_new <- rbindlist(list(dt_Catalan_elections, dt_n))
old <- as.character(unique(dt_new$Year))
# Assign a month to each year in order (first year -> Jan, second year -> Feb, etc.)
dt_new[,month := factor(Year, levels = old, labels = month.abb)]

# Build the ridgeline plot using `month` instead of Year for the y-axis labels
p <- dt_new %>%
      ggplot(aes(y = month)) +
      geom_density_ridges(
        aes(x = Percent, fill = paste(Year, Option)), 
        alpha = .8, color = "white", from = 0, to = 100
      ) +
      labs(
        x = "Vote (%)",
        y = "Election",
        title = "Indy vs Unionist vote in Catalan elections",
        subtitle = "Analysis unit: municipalities (n = 949)",
        caption = "Marc Belzunces (@marcbeldata) | Source: Idescat"
      ) +
      scale_y_discrete(expand = c(0.01, 0)) +
      scale_x_continuous(expand = c(0.01, 0)) +
      scale_fill_cyclical(
        breaks = c("1980 Indy", "1980 Unionist"),
        labels = c(`1980 Indy` = "Indy", `1980 Unionist` = "Unionist"),
        values = c("#ff0000", "#0000ff", "#ff8080", "#8080ff"),
        name = "Option", guide = "legend"
      ) +
      theme_ridges(grid = FALSE)
print(p)
image
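An alternative that may also work, sketched here without having been run against the plot in this thread: build the fill key as a factor whose levels are already in calendar order, instead of relying on the years to sort correctly. The sketch uses base R's `interaction()` with `lex.order = TRUE`; `grid` and `fill_key` are hypothetical names, and the assumption is that a discrete fill scale follows factor level order rather than re-sorting alphabetically.

```r
# Months as an ordered-by-calendar factor (Jan, Feb, ..., Dec).
month <- factor(month.abb, levels = month.abb)

# One row per month/Option combination, standing in for the real data.
grid <- expand.grid(Option = c("Indy", "Unionist"), month = month)

# lex.order = TRUE makes the first factor (month) vary slowest in the levels,
# so the levels run Jan Indy, Jan Unionist, Feb Indy, Feb Unionist, ...
grid$fill_key <- interaction(grid$month, grid$Option, sep = " ", lex.order = TRUE)

print(head(levels(grid$fill_key), 4))
```

If the assumption holds, mapping `fill = fill_key` and keeping `breaks = c("Jan Indy", "Jan Unionist")` would let the legend refer to months directly.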