thackl / gggenomes

A grammar of graphics for comparative genomics
https://thackl.github.io/gggenomes/

Operon boxes #173

Closed by waltercostamb 6 months ago

waltercostamb commented 6 months ago

Dear developers,

I am visualizing synteny in bacteria. Some genes are inside operons. Is it possible to add an operon visualization? For instance, an empty black box around the genes contained inside the operon?

Best, Maria

thackl commented 6 months ago

Here's how I would approach this. For starters, you can use geom_feat(). The coordinates can come from a simple dataframe.

library(tidyverse)
library(gggenomes)

genes <- tibble(
  seq_id = rep(c("s1", "s2"), each=5), # five genes per sequence
  start = c(50, 250, 500, 1000, 1300, 50, 450, 700, 1200, 1500),
  end = c(150, 450, 950, 1200, 1800, 350, 650, 1150, 1400, 2000) 
)

operons <- tibble(
  seq_id = c("s1", "s2"),
  start = c(225, 425),
  end = c(1225, 1425),
  name = c("operon X", "operon Y")
)

gggenomes(genes, feats=operons) +
  geom_feat(size=5) +
  geom_feat_note(aes(label=name), nudge_y = -.1) +
  geom_gene()

[image: plot produced by the code above]
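
If the operon membership is already annotated per gene, the operon coordinates do not have to be typed by hand. The following is only a minimal sketch: the `operon` column and the `pad` value are assumptions made for illustration and are not part of gggenomes. It derives one row per operon from the gene table with plain dplyr.

```r
library(dplyr)
library(tibble)

# Same genes as above, plus a hypothetical `operon` column
# (NA marks genes that do not belong to any operon)
genes <- tibble(
  seq_id = rep(c("s1", "s2"), each = 5),
  start = c(50, 250, 500, 1000, 1300, 50, 450, 700, 1200, 1500),
  end = c(150, 450, 950, 1200, 1800, 350, 650, 1150, 1400, 2000),
  operon = c(NA, "operon X", "operon X", "operon X", NA,
             NA, "operon Y", "operon Y", "operon Y", NA)
)

pad <- 25 # assumed padding so the bar extends slightly past the outer genes

# one row per operon: span from the first gene start to the last gene end
operons <- genes %>%
  filter(!is.na(operon)) %>%
  group_by(seq_id, name = operon) %>%
  summarize(start = min(start) - pad, end = max(end) + pad, .groups = "drop")
```

With this example data, the resulting table has the same columns (`seq_id`, `start`, `end`, `name`) and the same coordinates as the hand-written `operons` tibble, so it can be passed to `gggenomes(genes, feats=operons)` in the same way.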

Or you can just as well read them from a file:

# write example data to file for demonstration
# bed format: seq_id start end name score strand ...
write_tsv(operons, "operons.bed", col_names=FALSE)

# read a coordinate file (such as .bed, .gff, .gbk)
operons.bed <- read_feats("operons.bed")

gggenomes(genes, feats=operons.bed) +
  geom_feat(size=5) +
  geom_feat_note(aes(label=name), nudge_y = -.1) +
  geom_gene()

[image: plot produced by the code above]

If you want an actual box, you can use the standard ggplot geoms in gggenomes as well. The only catch is that you then need to map x and y and set the data track explicitly.

gggenomes(genes, feats=operons) +
  geom_rect(
    aes(xmin=x, xmax=xend, ymin=y-.1, ymax=y+.1), # explicitly map aesthetics
    data=feats(), # explicitly set track
    fill="cornflowerblue", color="black") +
  geom_feat_note(aes(label=name), nudge_y = -.12) +
  geom_gene()

[image: plot produced by the code above]
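
For the empty black box mentioned in the original question, the same approach should work with the fill switched off. This is a sketch that reuses the example data from above; the only change is `fill=NA`, which is standard ggplot2 behavior for drawing just the outline of a rectangle.

```r
library(tidyverse)
library(gggenomes)

# example data repeated from above so the snippet runs on its own
genes <- tibble(
  seq_id = rep(c("s1", "s2"), each = 5),
  start = c(50, 250, 500, 1000, 1300, 50, 450, 700, 1200, 1500),
  end = c(150, 450, 950, 1200, 1800, 350, 650, 1150, 1400, 2000)
)

operons <- tibble(
  seq_id = c("s1", "s2"),
  start = c(225, 425),
  end = c(1225, 1425),
  name = c("operon X", "operon Y")
)

p <- gggenomes(genes, feats = operons) +
  geom_rect(
    aes(xmin = x, xmax = xend, ymin = y - .1, ymax = y + .1),
    data = feats(),             # explicitly set track
    fill = NA, color = "black"  # no fill, black outline only
  ) +
  geom_feat_note(aes(label = name), nudge_y = -.12) +
  geom_gene()

p
```

The box height is controlled by the `.1` offsets around `y`; they may need adjusting depending on how tall the genes are drawn.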

Let me know if that helps or if you had something different in mind. And in case you struggle to implement this with your own data, ideally upload some example data.

waltercostamb commented 6 months ago

Thank you so much for the prompt answer. The first implementation solves my problem. I can easily see the operons now.