tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

scale_color_identity - additional NAs in the figure legend #5775

Closed totajuliusd closed 6 months ago

totajuliusd commented 6 months ago

I found a problem when using the scale_color_identity function with version 3.5 of ggplot2.

With previous versions of ggplot2, the plots I generated looked like this (see the legend):

image

Now it looks like this:

image

Here is the code to reproduce the latter plot:

library(topr)
m1 <- manhattan(list(CD_UKBB,CD_FINNGEN), show_legend = F)
m1+scale_color_identity(guide = "legend",  name="data", c("darkblue","#E69F00"), labels=c("dat1","dat2"))

How can I fix this? Any help with this is much appreciated!

teunbrand commented 6 months ago

Hi there, thanks for the report. Can you chop down the example to a more direct reproducible example, preferably with as few additional packages involved as possible? At this point it is unclear whether the manhattan() function is doing something wrong, or ggplot2 is.

totajuliusd commented 6 months ago

Hi and thank you for the quick reply. I have chopped down the example, and it looks like there are two problems.

1) NA gets added to the figure legend if I add text to the plot with ggrepel.
2) If I use two different shades of a color in my input dataset, I get a figure legend for both colors, although I specifically ask for just two legend labels (one per dataset) using scale_color_identity (this worked fine in previous ggplot2 versions).

Start by creating two dataframes, df1 and df2, for testing:

library(ggplot2)
df1 <- data.frame(P=floor(runif(100, min=0, max=100)), POS=c(1:100))
df2 <- data.frame(P=floor(runif(100, min=0, max=100)), POS=c(1:100))

Problem 1

p1 <- ggplot()+geom_point(data=df1, aes(x=POS, y=P, color="darkblue"))+ geom_point(data=df2, aes(x=POS, y=P, color="#E69F00"))
p1 <- p1 + scale_color_identity(guide = "legend",  name="data", c("darkblue","#E69F00"), labels=c("dat1","dat2"))
# after adding a label with ggrepel I get an extra NA label in the figure legend 
p1+ ggrepel::geom_text_repel(aes(x=2, y=0, label="Text",color="red"))

Problem 2

# add different shades of the same color
df1$color <- ifelse(df1$POS %% 10 == 0, "darkblue", "#9999D1")
df2$color <- ifelse(df2$POS %% 10 == 0, "#E69F00", "#F5D999")

p1 <- ggplot()+geom_point(data=df1, aes(x=POS, y=P, color=color))+ geom_point(data=df2, aes(x=POS, y=P, color=color))+theme_bw()
# when I add the legend, I get legend labels for all 4 colors, instead of just the two as I used to get in previous ggplot2 versions
p1<- p1+scale_color_identity(guide = "legend",  name="data", c("darkblue","#E69F00"), labels=c("dat1","dat2"))

# and then again if I add a label to the plot, I get an extra NA label in the figure legend
p1+ggrepel::geom_text_repel(aes(x=2, y=0, label="Text",color="red"))

teunbrand commented 6 months ago

Thanks for reducing this issue to an example that works with ggplot2 alone; that makes it much easier for us to see what is going on. I think both problems stem from the unnamed argument c("darkblue","#E69F00"). scale_color_identity() forwards it to discrete_scale(), where it ends up as the scale_name argument. That argument is deprecated, so it doesn't do anything.
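To illustrate how an unnamed argument can land in the wrong slot when it is passed through ..., here is a minimal sketch. The functions outer_scale() and inner_scale() are hypothetical stand-ins invented for this example; they are not ggplot2's real functions, and their signatures are simplified assumptions that only mimic the forwarding pattern described above.

```r
# Hypothetical stand-ins: NOT ggplot2's real functions or signatures.
# inner_scale() plays the role of the function that receives `...`.
inner_scale <- function(aesthetics, scale_name = NULL, limits = NULL, labels = NULL) {
  list(aesthetics = aesthetics, scale_name = scale_name,
       limits = limits, labels = labels)
}

# outer_scale() plays the role of the user-facing wrapper that forwards `...`.
outer_scale <- function(name = NULL, ..., guide = "none") {
  inner_scale("colour", ...)
}

# Unnamed vector: it is matched by position to the first free formal
# of inner_scale(), which here is `scale_name`, not `limits`.
unnamed <- outer_scale(name = "data", c("darkblue", "#E69F00"),
                       labels = c("dat1", "dat2"), guide = "legend")

# Named vector: it reaches the intended `limits` argument.
named <- outer_scale(name = "data", limits = c("darkblue", "#E69F00"),
                     labels = c("dat1", "dat2"), guide = "legend")

str(unnamed[c("scale_name", "limits")])
str(named[c("scale_name", "limits")])
```

In the unnamed call the colour vector is silently bound to scale_name and limits stays NULL; in the named call it is the other way around.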

If you use it like limits = c("darkblue","#E69F00"), both problems appear to go away. I don't think this is ggplot2 doing anything wrong; it is generally advised to name your arguments whenever you're funneling something through the ... argument.
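For reference, a sketch of the "Problem 2" example with the colour vector passed as a named limits argument might look like the following. It reuses the data frames from the report; the set.seed() call and the variable names identity_scale and p are additions made for this sketch only, and the ggrepel layer is left out to keep it ggplot2-only.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, added only so the sketch is repeatable
df1 <- data.frame(P = floor(runif(100, min = 0, max = 100)), POS = 1:100)
df2 <- data.frame(P = floor(runif(100, min = 0, max = 100)), POS = 1:100)

# Two shades per dataset, as in the report
df1$color <- ifelse(df1$POS %% 10 == 0, "darkblue", "#9999D1")
df2$color <- ifelse(df2$POS %% 10 == 0, "#E69F00", "#F5D999")

# The colour vector is now passed explicitly as `limits`
identity_scale <- scale_color_identity(
  guide  = "legend",
  name   = "data",
  limits = c("darkblue", "#E69F00"),
  labels = c("dat1", "dat2")
)

p <- ggplot() +
  geom_point(data = df1, aes(x = POS, y = P, color = color)) +
  geom_point(data = df2, aes(x = POS, y = P, color = color)) +
  theme_bw() +
  identity_scale

p
```

Naming the argument leaves no ambiguity about which parameter of the underlying scale constructor receives the vector.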

totajuliusd commented 6 months ago

That solves it! Thank you very much for your prompt help!