thackl / gggenomes

A grammar of graphics for comparative genomics
https://thackl.github.io/gggenomes/

Error using continuous fill of geom_link() at same time as discrete fill of geom_gene() #159

Closed lukesarre closed 1 year ago

lukesarre commented 1 year ago

Hello, thank you for developing this excellent package. I'm developing some figures, and it would be very useful to set the fill of genes from a categorical variable and the fill of links from a continuous variable. This may simply not be supported, or I may be making a silly error. Here is some code to reproduce what I am experiencing:

library(gggenomes)
library(dplyr)  # provides %>% and left_join()

seqs <- data.frame(seq_id = c("chr_A", "chr_B"),
                   seq_desc = NA,
                   length = 1000)

genes <- data.frame(seq_id = c("chr_A", "chr_A", "chr_B", "chr_B"),
                    type = "CDS",
                    start = c(200, 700, 100, 500),
                    end = c(400, 900, 400, 750),
                    feat_id = c("gene_1", "gene_2", "gene_3", "gene_4"),
                    cluster_id = c("cluster_A", "cluster_B", "cluster_A", "cluster_B"))

orthogroups <- data.frame(cluster_id = c("cluster_A", "cluster_B", "cluster_A", "cluster_B"),
                          feat_id = c("gene_1", "gene_2", "gene_3", "gene_4"),
                          omega = c(0.25, 5.7, 0.25, 5.7))

testPlot <- gggenomes(genes = genes, seqs = seqs) + 
  geom_seq() + 
  geom_bin_label()

testPlot <- testPlot %>% add_clusters(orthogroups)
testPlot$data$links$orthogroups <- testPlot$data$links$orthogroups %>% left_join(orthogroups, by = "feat_id")

#Check the plot is working:
testPlot + geom_gene()

#Check that I can color clusters by a discrete variable:
testPlot + geom_gene(aes(fill=cluster_id))

#Check that the links are working:
testPlot + geom_gene() + geom_link()

#Check that I can color clusters by a discrete variable, with links:
testPlot + geom_gene(aes(fill=cluster_id)) + geom_link()

#Check that I can colour the links by a continuous variable:
testPlot + geom_gene() + geom_link(aes(fill=omega))

#Check that I can colour the links by a continuous variable, and clusters by a discrete:
testPlot + geom_gene(aes(fill=cluster_id)) + geom_link(aes(fill=omega)) #This gives Error: Continuous value supplied to discrete scale
testPlot + geom_link(aes(fill=omega)) + geom_gene(aes(fill=cluster_id)) #This gives Error: Discrete value supplied to continuous scale

Kind regards,

Luke

iimog commented 1 year ago

Hi Luke,

Excellent question! By default, ggplot2 allows only one scale per aesthetic (e.g. color or fill), so a discrete fill and a continuous fill cannot share the same plot. I had the same problem in the past and used some hacky workarounds. Luckily, I just discovered that there is a far better alternative now: the ggnewscale package.

install.packages("ggnewscale") # if not already installed
library(ggnewscale)
testPlot + geom_link(aes(fill=omega)) + new_scale_fill() + geom_gene(aes(fill=cluster_id))

[Image: ggg_with_newscale]
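To customize each fill scale separately, the position of the scale relative to new_scale_fill() matters: scales added before it apply to the earlier layers, and scales added after it apply to the later ones. Here is a minimal sketch in plain ggplot2, assuming ggplot2 and ggnewscale are installed. The toy data frames `tiles` and `points` are made up for illustration and are not part of the issue's data.

```r
library(ggplot2)
library(ggnewscale)

# Hypothetical toy data: one continuous and one discrete variable
tiles <- data.frame(x = 1:4, y = 1, omega = c(0.25, 5.7, 0.25, 5.7))
points <- data.frame(
  x = 1:4, y = 1,
  cluster_id = c("cluster_A", "cluster_B", "cluster_A", "cluster_B")
)

p <- ggplot() +
  # First layer: continuous fill
  geom_tile(data = tiles, aes(x = x, y = y, fill = omega)) +
  # This scale is added before new_scale_fill(), so it styles the tiles
  scale_fill_viridis_c(name = "omega") +
  # Start a fresh, independent fill scale for the layers that follow
  new_scale_fill() +
  # Second layer: discrete fill (shape 21 is a fillable point)
  geom_point(data = points, aes(x = x, y = y, fill = cluster_id),
             shape = 21, size = 6) +
  # This scale is added after new_scale_fill(), so it styles the points
  scale_fill_brewer(name = "cluster", palette = "Set2")

print(p)
```

The same ordering should presumably carry over to the gggenomes plot above, e.g. adding `scale_fill_viridis_c()` between `geom_link(aes(fill=omega))` and `new_scale_fill()`, although that combination has not been tested here.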

lukesarre commented 1 year ago

Awesome, thank you!

Luke