Closed lcolladotor closed 7 years ago
I think we should talk to Hector Corrada Bravo about getting epivzr to do this - I don't think we should write a function here. I think we should write a tutorial. @lcolladotor
I'm going to see what I can do with https://twitter.com/hcorrada/status/844896126578970625.
Here's what I could get working with epivizr
:
Code at https://gist.github.com/lcolladotor/d02064583733754dea7828a984378c96.
Some notes:
epivizr
window on Google Chrome didn't show the menus well.epivizr
seems to be from hg19. epivizr
.Ahh, I forgot to mention that I couldn't get this to work:
status <- colData(rse)$group
index <- split(seq(along = status), status)
logCounts <- log2(assays(rse)$counts + 1)
means <- sapply(index, function(ind) rowMeans(logCounts[, ind]))
rse_viz <- SummarizedExperiment(means, rowRanges = rowRanges(rse))
mean_plot <- app$plot(rse_viz, datasource_name = 'log 2 mean counts', columns = c('induced', 'uninduced'))
It's based on the epivizr
vignette code.
Hi,
Here is another option, using the UCSC genome browser. Note that the bigwig files in this case are not normalized to the same library size. So the visual comparison is not ideal. It's possible to do so, but that involves making new bigwig files with wiggletools
or similar tools. The upside is that it's easy to change the window range. For example, here is a screenshot after I clicked on "zoom out 10x". Also, the user doesn't have to download any base-pair data from recount2
.
The code is at https://gist.github.com/lcolladotor/c022472ca2fd7199e13fe61d0765096c and I used the options specified at https://ycl6.gitbooks.io/rna-seq-data-analysis/visualization.html for building the tracks file. Notably smoothingWindow=4
.
If there are too many samples in a project, another option is to visualize the mean. We do have mean bigwig files (normalized to the same library size), so comparing data from different projects would be easy with either the UCSC genome browser or with epivizr
.
Best, Leo
I think we should write a tutorial that shows people how to:
(1) obtain bigwigs (2) create a granges object (3) make the base resolution plots we usually do
On Wed, Mar 29, 2017 at 9:18 AM Leonardo Collado-Torres < notifications@github.com> wrote:
Hi,
Here is another option, using the UCSC genome browser. Note that the bigwig files in this case are not normalized to the same library size. So the visual comparison is not ideal. It's possible to do so, but that involves making new bigwig files with wiggletools or similar tools. The upside is that it's easy to change the window range. For example, here is a screenshot after I clicked on "zoom out 10x". Also, the user doesn't have to download any base-pair data from recount2.
[image: screen shot 2017-03-29 at 11 10 54 am] https://cloud.githubusercontent.com/assets/2288213/24462034/f6403a4e-1470-11e7-93f2-2fb7fea44272.png
You can see the data at https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr22%3A50434573-50664482&hgsid=586178827_e4XvaDgOXdWgRLFaaKkGl5DNYnP7 .
The code is at https://gist.github.com/lcolladotor/c022472ca2fd7199e13fe61d0765096c and I used the options specified at https://ycl6.gitbooks.io/rna-seq-data-analysis/visualization.html for building the tracks file. Notably smoothingWindow=4.
If there are too many samples in a project, another option is to visualize the mean. We do have mean bigwig files (normalized to the same library size), so comparing data from different projects would be easy with either the UCSC genome browser or with epivizr.
Best, Leo
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/leekgroup/recount/issues/9#issuecomment-290123579, or mute the thread https://github.com/notifications/unsubscribe-auth/ABf7WtihrQzV2OlO2bZibE8dlbxi254rks5rqnZHgaJpZM4MloUf .
http://bioconductor.org/packages/devel/bioc/html/wiggleplotr.html looks interesting (noticed via https://twitter.com/seandavis12/status/847628575339536386). From http://bioconductor.org/packages/devel/bioc/vignettes/wiggleplotr/inst/doc/wiggleplotr.html it looks like you can specify a scaling factor.
Thanks @lcolladotor for linking me to this conversation. I think it would be a great application for wiggleplotr. Internally, wiggleplotr uses rtracklayer::BigWigSelection and rtracklayer::import.bw functions to import read coverage from bigWig files (see the readCoverageFromBigWig function in here: https://github.com/kauralasoo/wiggleplotr/blob/master/R/functions.R). If rtracklayer support remote bigBig files then they should also work with wiggleplotr.
Let me know if you have any questions regarding the usage of wiggleplotr and I'll be happy to help. I'm travelling until Tuesday but will have more time after that.
The tutorial using derfinderPlot
is part of https://github.com/LieberInstitute/recountWorkflow. I mentioned wiggleplotr and epivizR as other options. Showing how to use all the different tools could be a workflow itself, but for now that is out of the scope of the recount workflow I wrote. In any case, the workflow addresses this issue, so I'll close it for now.
I've received feedback from users interested in being able to plot the base-level coverage for a set of samples in recount2.
One way of doing this is via
derfinderPlot::plotRegionCoverage()
. I could write a function where the user specifies:run
for SRA and GTEx,gdc_file_id
for TCGA),Then internally run
derfinder::loadCoverage()
accessing the data from either local bigwigs (if at JHPCE, SciServer or if the user downloaded them) or via the web. Something that could be a bit of a bottle neck is the annotation needed forderfinderPlot::plotRegionCoverage()
. That would normally involve creating a genomic state object from Gencode v25 hg38 which takes time even for one chromosome. I could pre-make them split by chr and have the function download the files from the web behind the scenes. In any case, right now this approach would have to deal with https://github.com/rafalab/bumphunter/issues/15 sincederfinderPlot::plotRegionCoverage()
requires the output ofbumphunter::matchGenes()
.Another option would be to open the data in the UCSC genome browser or something like that.
A third option could be to use
epivizr
http://bioconductor.org/packages/release/bioc/vignettes/epivizr/inst/doc/IntroToEpivizr.html. Although I'm not sure that it can read bigwig files from the web.Thoughts @andrewejaffe @jtleek ?
Right now, users can get the list of URLs from
recount::recount_url
and load them into IGV or similar programs.