thackl / gggenomes

A grammar of graphics for comparative genomics
https://thackl.github.io/gggenomes/

Mapping metadata to label #114

Closed hkaspersen closed 2 years ago

hkaspersen commented 2 years ago

Hello! I am having some trouble adding metadata to the figures made by gggenomes. I would like to do something like this:

gggenomes(seqs = is_seqs, genes = all_is_gff, links = is_aln_data) +
  geom_seq() +
  geom_bin_label(data = metadata,
                 mapping = aes(label = country))

Here I would like to replace the bin label with a variable from another data frame (or from one of the data frames that gggenomes already requires, if it can be mapped from there). Is this possible in any way?

thackl commented 2 years ago

Hi Håkon,

Good question. It is definitely possible, but as of now probably not as easy as it should be. geom_bin_label uses data derived from the sequence table, so in theory you can make any column from seqs available to geom_bin_label.

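library(gggenomes)
library(tibble)

# minimal sequence table with an extra metadata column (country)
s0 <- tibble(
  seq_id = c("a1", "a2", "b1"),
  bin_id = c("A", "A", "B"),
  length = c(1000, 2000, 3000),
  country = c("Norway", "Norway", "Sweden")
)

p1 <- gggenomes(seqs=s0) + geom_seq()
p1 + geom_bin_label() # default: labels show bin_id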


You can add additional variables from seqs to the bin table, although granted, this is hard to guess unless you know how.

p1 + geom_bin_label(aes(label=country)) # doesn't work because country is not in the internal bin table
p1 %>% pull_bins() # no country column here
p1 %>% pull_bins(.group=vars(country)) # but it can be included via .group
p1 + geom_bin_label(aes(label=country), data=bins(.group=vars(country)))

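To illustrate why country goes missing and why .group brings it back, here is a rough analogy in plain dplyr. This is only a conceptual sketch and not how gggenomes builds its bin table internally; the names seqs, bins_plain, bins_grouped, n_seqs and total_length are made up for the example.

```r
library(dplyr)

# Same toy sequence table as above, rebuilt so the block is self-contained
seqs <- tibble(
  seq_id  = c("a1", "a2", "b1"),
  bin_id  = c("A", "A", "B"),
  length  = c(1000, 2000, 3000),
  country = c("Norway", "Norway", "Sweden")
)

# Summarising per bin_id collapses the sequences, and any column that is
# neither a grouping variable nor summarised (here: country) is dropped
bins_plain <- seqs %>%
  group_by(bin_id) %>%
  summarise(n_seqs = n(), total_length = sum(length), .groups = "drop")

# Adding country as an extra grouping variable carries it into the result
# (assumption: country is constant within each bin)
bins_grouped <- seqs %>%
  group_by(bin_id, country) %>%
  summarise(n_seqs = n(), total_length = sum(length), .groups = "drop")

names(bins_plain)
names(bins_grouped)
```

If country varied within a bin, the grouped version would return one row per bin/country combination, so this approach fits best for metadata that is constant per bin.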

It probably would be good to implement the following in future versions. This should be closest to what you suggested. Feel free to comment!

For now, you can just copy and paste the code to make it work for you as well.

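# something like
library(dplyr) # left_join(), quos(), vars()

bins_with_meta <- function (..., .group = vars(), .meta = NULL){
    dots <- quos(...)
    # return a function of the plot layout, so it can be passed as data=
    function(.x, ...) {
        b <- pull_bins(.x, !!!dots, .group = .group)
        if(!is.null(.meta)){
          # joins on all shared column names (here: bin_id)
          b <- left_join(b, .meta)
        }
        b
    }
}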

meta <- tibble(
  bin_id = c("A", "B"),
  country = c("Norway", "Sweden"),
  year=2022)

gggenomes(seqs=s0) + geom_seq() +
  geom_bin_label(
    aes(label=paste(bin_id, "\n", country, year)),
    data=bins_with_meta(.meta = meta), expand_left = 0.4)

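The join step on its own can be tried without gggenomes. The following is a minimal sketch in plain dplyr, assuming a hypothetical bin_table that stands in for whatever pull_bins() returns (the real table has more columns); labelled is also just an illustrative name.

```r
library(dplyr)

# Hypothetical stand-in for the bin table; bin "C" has no metadata
bin_table <- tibble(bin_id = c("A", "B", "C"))

meta <- tibble(
  bin_id  = c("A", "B"),
  country = c("Norway", "Sweden"),
  year    = 2022
)

# left_join keeps every bin; bins without a match in meta get NA
labelled <- bin_table %>%
  left_join(meta, by = "bin_id") %>%
  mutate(label = paste(bin_id, "\n", country, year))

labelled
```

Naming the key explicitly with by = "bin_id" avoids the message dplyr prints when it has to guess the join columns. Note that bins missing from the metadata table end up with NA in their label, so it may be worth checking for those before plotting.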

hkaspersen commented 2 years ago

Thanks, this works well! I would also like to know how to change the font of the text in geom_bin_label, as it does not seem to be affected by theme(text = element_text(family = "Open Sans")).

thackl commented 2 years ago

Interesting. I would have thought this would work. Internally geom_bin_label just calls geom_text. I'll need to look into it...

thackl commented 2 years ago

Ok, so changing the font is actually even easier: geom_bin_label(family = "Open Sans"). I don't know why/how theme(text = ...) does not work, but it also does not work with a regular ggplot + geom_text. theme(text = ...) only seems to affect axis labels etc., i.e. items related to the plot theme, not the geoms...
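The same behaviour can be checked with plain ggplot2. This is a small sketch, assuming the generic families "serif" and "sans" in place of "Open Sans" (which has to be installed and registered with the graphics device to be usable); df and p are just example names.

```r
library(ggplot2)

df <- data.frame(x = 1:2, y = 1:2, label = c("Norway", "Sweden"))

p <- ggplot(df, aes(x, y, label = label)) +
  geom_text(family = "serif") +                # font for the text drawn by the geom
  theme(text = element_text(family = "sans"))  # font for theme elements (axis titles etc.)

# The family passed to geom_text() is stored with the layer itself,
# separately from anything set through theme()
p$layers[[1]]$aes_params$family
```

So the two settings live in different places: the theme controls titles, axis text and legends, while each text geom carries its own family.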