thackl / gggenomes

A grammar of graphics for comparative genomics
https://thackl.github.io/gggenomes/

Zoom sequence #111

Closed hkaspersen closed 2 years ago

hkaspersen commented 2 years ago

Hello, and thank you for developing this brilliant package!

I was wondering whether it is possible to zoom in on a specific region of interest, either the genetic neighborhood around a gene selected by name, or simply a coordinate range (as with xlim)?

Thanks in advance!

thackl commented 2 years ago

Hi Håkon,

yes, you can! Have a look at focus(), and see below for a few simple examples:

library(tidyverse)
library(gggenomes)

# from https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/009/738/455/GCA_009738455.1_ASM973845v1/
ecoli_genes <- read_feats("GCA_009738455.1_ASM973845v1_genomic.gbff")

# plotting all genes at once is too crowded to be useful
p <- gggenomes(ecoli_genes) +
  geom_seq() +
  geom_gene() +
  geom_gene_tag(aes(label = name))

# zoom in based on gene name
p %>% focus(name == "metH")


p %>% focus(name == "metH", .expand = c(1e4, 3e4))


# works with multiple hits - each locus becomes a new seq
p %>% focus(str_detect(name, "nag"))
# see .max_dist to control whether nearby hits are treated as part of the
# same locus, and .locus_score/.locus_filter to further filter which of those
# loci are shown


# make each locus its own bin
p %>% focus(str_detect(name, "nag"), .locus_bin = "locus")


# zoom in by coordinate table
zoom_to <- tibble(seq_id="CP046527", start=2150000, end=2200000)
p %>% focus(.loci = zoom_to)

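If you would rather compute the zoom window yourself, one option is to build the coordinate table from a gene's own position and pass it to focus(.loci = ...) as above. The following is only a minimal sketch: the gene table, the sequence and gene names, and the 10 kb flank are made up for illustration, and in practice the coordinates would come from read_feats().

```r
library(dplyr)
library(tibble)

# Hypothetical gene table (illustration only); real coordinates would come
# from read_feats() on your annotation file
genes <- tibble(
  seq_id = c("seqA", "seqA", "seqA"),
  name   = c("geneX", "geneY", "geneZ"),
  start  = c(1000, 52000, 90000),
  end    = c(2500, 55000, 91200)
)

# Assumed flank size: pad the target gene by 10 kb on each side,
# clamping the left edge at position 1
flank <- 10000
zoom_to <- genes %>%
  filter(name == "geneY") %>%
  transmute(seq_id, start = pmax(start - flank, 1), end = end + flank)

zoom_to

# With a gggenomes plot `p` built on the same sequences, this table could
# then be used like the coordinate table above:
# p %>% focus(.loci = zoom_to)
```

This should give roughly the same view as focusing by name with .expand, but it leaves you free to compute the window however you like, for example with asymmetric flanks or several regions at once.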

hkaspersen commented 2 years ago

Thanks, this worked perfectly!