PhanstielLab / Sushi

Tools for visualizing genomics data
67 stars 19 forks source link

Missing genes on exome level zoom #11

Open mmoisse opened 6 years ago

mmoisse commented 6 years ago

We you want to zoom into a gene at exon level you don't see the exon anymore. As shown by the following figure: org

I believe this has to do with a transcriptinfo filtering step in plotGenes:

transcriptinfo = transcriptinfo[which(
    (transcriptinfo[, 2] > chromstart & transcriptinfo[, 2] < chromend) | 
    (transcriptinfo[, 3] > chromstart & transcriptinfo[, 3] < chromend)
), ]

I believe you should add the situation where the zoom window falls completely inside a gene:

transcriptinfo = transcriptinfo[which(
    (transcriptinfo[, 2] > chromstart & transcriptinfo[, 2] < chromend) | 
    (transcriptinfo[, 3] > chromstart & transcriptinfo[, 3] < chromend) |
    (transcriptinfo[, 2] <= chromstart & transcriptinfo[, 3] >= chromend)
), ]

Which gives the figure I expected: new

Here is the code to reproduce the plot

library(Sushi)
## Get data
Sushi_data = data(package = 'Sushi')
data(list = Sushi_data$results[,3])

## Setup position
chrom = "chr11"
chromstart = 1945000
chromend = 1960000

## Get genes
mart = useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
geneinfo = getBM(attributes = c("chromosome_name", "exon_chrom_start","exon_chrom_end", "external_gene_name", "strand"), filters = c("chromosome_name", "start", "end"), values = list(gsub("chr","",chrom), chromstart, chromend), mart = mart)
names(geneinfo) = c("chrom", "start", "stop", "gene","strand")
geneinfo$score = "."
geneinfo = geneinfo[, c(1, 2, 3, 4, 6, 5)]
geneinfo$chrom = paste0("chr", sub("chr","",geneinfo$chrom))

## Set graphic parameters
par(mar=c(3,4,1,1))
layout(matrix(c(1,2,3,4), 4, 1, byrow = TRUE))

## Plot coverage
plotBedgraph(Sushi_ChIPSeq_CTCF.bedgraph,
   chrom,chromstart,chromend, 
   transparency=.50,color=SushiColors(2)(2)[1]
)                                                                                                                                                                                       
axis(side=2,las=2,tcl=.2)
labelgenome(chrom,chromstart,chromend, n=5,scale="bp")

## Plot gene
plotGenes(geneinfo, chrom,chromstart,chromend)

## Create zoom box
chrom = "chr11"
chromstart = 1956000
chromend = 1958000
zoomsregion(c(chromstart,chromend), extend=c(0.01,0.13), wideextend=0.05, offsets=c(0,0))

## Plot coverage
plotBedgraph(Sushi_ChIPSeq_CTCF.bedgraph,
   chrom,chromstart,chromend, 
   transparency=.50,color=SushiColors(2)(2)[1]
)                                                                                                                                                                                       
axis(side=2,las=2,tcl=.2)
labelgenome(chrom,chromstart,chromend, n=5,scale="bp")

## Plot gene
plotGenes(geneinfo, chrom,chromstart,chromend)
lucapandolfini commented 4 years ago

In my opinion the test

transcriptinfo = transcriptinfo[which((transcriptinfo[, 2] > 
        chromstart & transcriptinfo[, 2] < chromend) | (transcriptinfo[, 
        3] > chromstart & transcriptinfo[, 3] < chromend)), ]

should be modified into:

transcriptinfo = transcriptinfo[which((transcriptinfo[, 2] >=
        chromstart & transcriptinfo[, 2] <= chromend) | (transcriptinfo[, 
        3] >= chromstart & transcriptinfo[, 3] <= chromend)), ]

in order to account for transcripts starting exactly at the chromstart AND ending exactly at chromend. Optionally, it would be nice the part of a certain transcript spanning the chromstart/chromend selection were visualized instead of dropping it.