estepi / ASpli

Analysis of alternative splicing using RNAseq
7 stars 1 forks source link

exportIntegratedSignals returns empty Gene View output #17

Open Estherrr1997 opened 3 years ago

Estherrr1997 commented 3 years ago

Hi,

I have followed the ASpli tutorial from: https://rdrr.io/bioc/ASpli/src/inst/doc/ASpli.R and ran the commands (the ones that weren't commented out), although, when I try to visualize the alternative splicing events using exportIntegratedSignals, it returns an html page in which I can click on the "View" buttons; this returns only a small black block and not the output as described in figure 8 in: https://www.bioconductor.org/packages/devel/bioc/vignettes/ASpli/inst/doc/ASpli.pdf Is there a way to fix this?

Also, do the bam files come from the ASpli package or do I have to download them?

Kind regards, Esther

liminghao663 commented 3 years ago

same question

estepi commented 3 years ago

Hi all, Im sorry for the delay. Can you paste your R session? In which OS are you working? thanks !

liminghao663 commented 3 years ago

Thank you for your reply. I run it in ubuntu 20 I tried with my own data and the pipe runs well without error. But in the function of exportIntegratedSignals, no plot was generated.

sessionInfo() R version 4.0.3 (2020-10-10) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.04.2 LTS

Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale: [1] LC_CTYPE=zh_CN.UTF-8 LC_NUMERIC=C LC_TIME=zh_CN.UTF-8 LC_COLLATE=zh_CN.UTF-8 LC_MONETARY=zh_CN.UTF-8
[6] LC_MESSAGES=zh_CN.UTF-8 LC_PAPER=zh_CN.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages: [1] DT_0.17 GenomicFeatures_1.42.1 ASpli_2.0.0 edgeR_3.32.1
[5] limma_3.46.0 ggplot2_3.3.3 pheatmap_1.0.12 doParallel_1.0.16
[9] iterators_1.0.13 foreach_1.5.1 NMF_0.23.0 cluster_2.1.0
[13] rngtools_1.5 pkgmaker_0.32.2 registry_0.5-1 BSgenome.Hsapiens.UCSC.hg38_1.4.3 [17] BSgenome_1.58.0 rtracklayer_1.50.0 Biostrings_2.58.0 XVector_0.30.0
[21] GenomicRanges_1.42.0 GenomeInfoDb_1.26.2 maftools_2.6.05 BiocParallel_1.24.1
[25] GSEABase_1.52.1 graph_1.68.0 annotate_1.68.0 XML_3.99-0.5
[29] AnnotationDbi_1.52.0 IRanges_2.24.1 S4Vectors_0.28.1 Biobase_2.50.0
[33] BiocGenerics_0.36.0 GSVA_1.38.2

loaded via a namespace (and not attached): [1] backports_1.2.1 Hmisc_4.5-0 BiocFileCache_1.14.0 plyr_1.8.6 igraph_1.2.6
[6] lazyeval_0.2.2 splines_4.0.3 crosstalk_1.1.1 gridBase_0.4-7 digest_0.6.27
[11] ensembldb_2.14.0 htmltools_0.5.1.1 checkmate_2.0.0 magrittr_2.0.1 memoise_2.0.0
[16] remotes_2.2.0 matrixStats_0.58.0 vroom_1.4.0 askpass_1.1 prettyunits_1.1.1
[21] jpeg_0.1-8.1 colorspace_2.0-0 blob_1.2.1 rappdirs_0.3.3 xfun_0.21
[26] dplyr_1.0.4 jsonlite_1.7.2 crayon_1.4.1 RCurl_1.98-1.2 VariantAnnotation_1.36.0
[31] survival_3.2-7 glue_1.4.2 gtable_0.3.0 zlibbioc_1.36.0 UpSetR_1.4.0
[36] DelayedArray_0.16.2 scales_1.1.1 DBI_1.1.1 Rcpp_1.0.6 htmlTable_2.1.0
[41] xtable_1.8-4 progress_1.2.2 foreign_0.8-81 bit_4.0.4 Formula_1.2-4
[46] htmlwidgets_1.5.3 httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.1 pkgconfig_2.0.3
[51] farver_2.1.0 Gviz_1.34.0 nnet_7.3-15 dbplyr_2.1.0 locfit_1.5-9.4
[56] tidyselect_1.1.0 labeling_0.4.2 rlang_0.4.10 reshape2_1.4.4 munsell_0.5.0
[61] tools_4.0.3 cachem_1.0.4 generics_0.1.0 RSQLite_2.2.3 evaluate_0.14
[66] stringr_1.4.0 fastmap_1.1.0 yaml_2.2.1 knitr_1.31 bit64_4.0.5
[71] purrr_0.3.4 AnnotationFilter_1.14.0 xml2_1.3.2 biomaRt_2.46.3 BiocStyle_2.18.1
[76] compiler_4.0.3 rstudioapi_0.13 curl_4.3 png_0.1-7 statmod_1.4.35
[81] tibble_3.1.0 stringi_1.5.3 lattice_0.20-41 ProtGenerics_1.22.0 Matrix_1.3-2
[86] vctrs_0.3.6 pillar_1.5.0 lifecycle_1.0.0 BiocManager_1.30.10 data.table_1.14.0
[91] bitops_1.0-6 R6_2.5.0 latticeExtra_0.6-29 gridExtra_2.3 codetools_0.2-18
[96] dichromat_2.0-0 MASS_7.3-53 assertthat_0.2.1 SummarizedExperiment_1.20.0 openssl_1.4.3
[101] withr_2.4.1 GenomicAlignments_1.26.0 Rsamtools_2.6.0 GenomeInfoDbData_1.2.4 hms_1.0.0
[106] rpart_4.1-15 grid_4.0.3 tidyr_1.1.2 rmarkdown_2.7 MatrixGenerics_1.2.1
[111] biovizBase_1.38.0 base64enc_0.1-3

Estherrr1997 commented 3 years ago

I am using Rstudio in Windows 10

This is the code I ran:

library(ASpli)

BAMFiles <- c("path_to_bams/CT_time1_rep1.BAM", "path_to_bams/CT_time1_rep2.BAM", "path_to_bams/CT_time2_rep1.BAM", "path_to_bams/CT_time2_rep2.BAM", "path_to_bams/TR_time1_rep1.BAM", "path_to_bams/TR_time1_rep2.BAM", "path_to_bams/TR_time2_rep1.BAM", "path_to_bams/TR_time2_rep2.BAM") (targets <- data.frame( bam = BAMFiles, genotype = c( 'CT', 'CT', 'CT', 'CT', 'TR', 'TR', 'TR', 'TR' ), time = c( 't1', 't1', 't2', 't2', 't1', 't1', 't2', 't2' ), stringsAsFactors = FALSE ))

getConditions( targets )

library( GenomicFeatures ) gtfFileName <- aspliExampleGTF() genomeTxDb <- makeTxDbFromGFF( gtfFileName ) features <- binGenome( genomeTxDb )

BAMFiles <- aspliExampleBamList()

targets <- data.frame( row.names = paste0('Sample',c(1:12)), bam = BAMFiles, f1 = c( 'A','A','A','A','A','A', 'B','B','B','B','B','B'), f2 = c( 'C','C','C','D','D','D', 'C','C','C','D','D','D'), stringsAsFactors = FALSE)

getConditions(targets) mBAMs <- data.frame(bam = sub("_[02]","",targets$bam[c(1,4,7,10)]), condition= c("A_C","A_D","B_C","B_D"))

gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0)

asd<- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0, threshold = 5, minAnchor = 10)

gb <- gbDUreport(gbcounts,contrast = c( 1, -1, -1, 1 ) )

genesDE(gb)[1:5,] binsDU(gb)[1:5,]

jdur <- jDUreport(asd, contrast = c( 1, -1, -1, 1 ))

localej(jdur)[1:5,1:8]

sr <- splicingReport(gb, jdur, counts = gbcounts) is <- integrateSignals(sr,asd)

exportIntegratedSignals( is, output.dir="aspliExample", sr, gbcounts, features, asd, mBAMs)

exportIntegratedSignals( is, output.dir="is", sr, gbcounts, features, asd, mBAMs, jCompletelyIncluded = FALSE, zoomRegion = 1.5, useLog = FALSE, tcex = 1, ntop = NULL, openInBrowser = T, makeGraphs = TRUE, bforce=FALSE )

form <- formula(~f1+f2+f1:f2)

model.matrix(form,targets)

estepi commented 3 years ago

Yes, please, samtools should be installed. Anyway I'm checking your error in detail and tell you something soon. Also, we did update our vignette and remove unuseful code. We are pushing it too devel branch in Bioconductor

Estherrr1997 commented 3 years ago

I do have samtools installed on my laptop, do I have to pass the path to where samtools is installed? I dont know how to fix this error, I really hope you can help me

estepi commented 3 years ago

Hi Esther, 2 ideas:

1.- Since you are working on Windows, please check you have correct permissions on your output folders and also If samtools is running correctly.

https://support.bioconductor.org/p/98783/

2.- We prepared a new repo for the updated version of ASpli. Here also you can find a brief example:

https://github.com/achernom/ASpli

Can you check it and tell me something?

thanks

Estherrr1997 commented 3 years ago

Thanks, I fixed the samtools error and used the code from the new ASpli version. I did not have a 'txdb.sqlite' file so I created one, I ran the following:

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")

BiocManager::install("ASpli")

library(ASpli)

download.file("https://www.encodeproject.org/files/gencode.v24.primary_assembly.annotation/@@download/gencode.v24.primary_assembly.annotation.gtf.gz", "gencode.v24.primary_assembly.annotation.gtf.gz") library(GenomicFeatures) txdb = makeTxDbFromGFF('gencode.v24.primary_assembly.annotation.gtf.gz')

library(AnnotationDbi) saveDb(txdb, 'txdb.sqlite')

genome <- loadDb("txdb.sqlite") features <- binGenome(genome) targets <- data.frame(bam = c("CT_1.BAM", "CT_2.BAM","CT_3.BAM", "TR_1.BAM", "TR_2.BAM", "TR_3.BAM"), genotype = c( "CT", "CT", "CT", "TR", "TR", "TR" ), stringsAsFactors = FALSE ) mBAMs <- data.frame(bam=c("CT.BAM", "TR.BAM"),condition=c("CT","TR"))

counts <- gbCounts(features = features, targets = targets, minReadLength = 125L, maxISize = 50000)

and got the next message: Summarizing CT_1 Error in value[3L] : failed to open BamFile: file(s) do not exist: 'CT_1.BAM'

Do I have to download the test data somewhere?

thankyou

estepi commented 3 years ago

Hi Esther, test data is a generic example... you should replace with your own bam files. Dont forget to have .bai files in same folder! I'll be waiting for your news,

Estherrr1997 commented 3 years ago

Okay thanks! I have my own txdb.sqlite file and want to test ASpli on like 2 to 6 bam files. Do I have to change the code to something like this?:

targets <- data.frame(bam = c("first.bam", "second.bam"), genotype = c( "CT", "TR" ), stringsAsFactors = FALSE ) mBAMs <- data.frame(bam=c("CT.BAM", "TR.BAM"),condition=c("CT","TR")) counts <- gbCounts(features = features, targets = targets, minReadLength = 125L, maxISize = 50000)

please correct me if this is not the right way to do this

thank you so much for helping me, really appreciate it :)

estepi commented 3 years ago

You need to use duplicates Do you have them?

Estherrr1997 commented 3 years ago

I am working with patient data; the bam files contain information from different people with the same mutation, is it possible to use this information/data as "duplicates"? If so, how do I load the bam files in the correct way?

estepi commented 3 years ago

You should decide what you want to analyze ... Do you have control samples (i.e no mutations?)

Estherrr1997 commented 3 years ago

No we only have data from patients with mutations; we would like to compare these mutations and see the different splice events that occur. I thought maybe I can use 3 bam files from mutation X and compare that with 3 bam files from mutation Y. Is that possible with ASpli?

estepi commented 3 years ago

You can try and see if you can detect differences in AS. Where is your mutation? I think It should be something really disruptive if you want to see effects on AS...

Estherrr1997 commented 3 years ago

Yes I really want to try that, we are currently investigating the AS events in leukemia and we would like to visualize it too. Do the bam files have to be named "CT_X.BAM/TR_X.BAM" in order to use: genotype = c( "CT", "CT", "CT", "TR", "TR", "TR" ), stringsAsFactors = FALSE )?

Estherrr1997 commented 3 years ago

I tried running ASpli on my own data, it seemed to go well but then I got this error:

-Differential signal estimation:

gb <- gbDUreport(counts, contrast = c(-1, 1)) Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'gbDUreport' for signature '"standardGeneric"'

what does the "gbDUreport(counts, contrast = c(-1, 1))" do?

Esther

estepi commented 3 years ago

it does the differential gene expression and bin usage signal estimation Run: 1: class(counts) 2.- gb <- gbDUreport(gbcounts) (not specifying contrast) thanks

estepi commented 3 years ago

It seems counts is not correct. Can you check?

estepi commented 3 years ago

counts object seems to be ok. Maybe there is an error in the script. Can you paste it? thanks

estepi commented 3 years ago

gb <- gbDUreport(counts, contrast = c(-1, 1)) should be the correct one

Can you show me how do you define the targets data.frame? How do you define conditions here?

PILOT RUN targets_test <- data.frame( ID = bam_ids, bam = bam_fnames, stringsAsFactors = F ) targets_test <- merge(targets_test, aml_classes_sub, by = 'ID') targets_test <- tibble::column_to_rownames(targets_test, 'ID') targets_test

txs!

estepi commented 3 years ago

Hi Esther, yes.

You should use: short_001 and short_002 as condition/phenotype

You can fix using: targets$genotype<-rep(c("short_001","short_002"), each=3)

I'm waiting for your results

Estefi

estepi commented 3 years ago

I suspect from your annotation. What do you see as output when you run: xdb = makeTxDbFromGFF('gencode.v24.primary_assembly.annotation.gtf.gz')

You can find this info in a log file inside your directory

estepi commented 3 years ago

Yes, I refer to this log filean

Can you run this code ? txdb = makeTxDbFromGFF('gencode.v24.primary_assembly.annotation.gtf.gz')

thanks

estepi commented 3 years ago

This error means you dont have the file in your directory: "No such file or directory"

Is this your annotation file? Can you find the correct path?

estepi commented 3 years ago

You should generate an appropiate genome DB. If you don't have exons in your annotation file, you won't be able to quantify alternative splicing.

Try to find the original GFT you've used to build the TxDb

Estherrr1997 commented 3 years ago

I guess there is something wrong with my TxDb.. I am trying to generate the splice graphs using the data available from https://bioconductor.org/packages/release/bioc/vignettes/ASpli/inst/doc/ASpli.R and ran this:

library(Rsamtools)

library( ASpli ) library( GenomicFeatures ) gtfFileName <- aspliExampleGTF() genomeTxDb <- makeTxDbFromGFF( gtfFileName ) features <- binGenome( genomeTxDb )

BAMFiles <- aspliExampleBamList() targets <- data.frame( row.names = paste0('Sample',c(1:12)), bam = BAMFiles, f1 = c( 'A','A','A','A','A','A', 'B','B','B','B','B','B'), f2 = c( 'C','C','C','D','D','D', 'C','C','C','D','D','D'), stringsAsFactors = FALSE) getConditions(targets) ###################################################

code chunk number 37: Ex02.f

################################################### mBAMs <- data.frame(bam = sub("_[02]","",targets$bam[c(1,4,7,10)]), condition= c("A_C","A_D","B_C","B_D"))

###################################################

code chunk number 38: Ex02.i

################################################### gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0)

asd<- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0, threshold = 5, minAnchor = 10) ###################################################

code chunk number 40: Ex02.k

################################################### gb <- gbDUreport(gbcounts,contrast = c( 1, -1, -1, 1 ) )

###################################################

code chunk number 41: Ex02.k2

################################################### genesDE(gb)[1:5,] binsDU(gb)[1:5,]

###################################################

code chunk number 42: Ex02.l

################################################### jdur <- jDUreport(asd, contrast = c( 1, -1, -1, 1 ))

###################################################

code chunk number 43: Ex02.kk (eval = FALSE)

###################################################

localec(jdur)[1:5,]

###################################################

code chunk number 44: Ex02.kkk

################################################### localej(jdur)[1:5,1:8]

###################################################

code chunk number 45: Ex02.m

################################################### sr <- splicingReport(gb, jdur, counts = gbcounts) is <- integrateSignals(sr,asd)

###################################################

code chunk number 46: Ex03

################################################### exportIntegratedSignals( is, output.dir="aspliExample", sr, gbcounts, features, asd, mBAMs)

###################################################

code chunk number 47: Ex04

################################################### form <- formula(~f1+f2+f1:f2)

###################################################

code chunk number 48: Ex05

################################################### model.matrix(form,targets)

This all seems to work fine, although I get this message again when trying to generate the graphs:

exportIntegratedSignals( is, output.dir="aspliExample",

  • sr, gbcounts, features, asd, mBAMs) Generating graphs... | | 0%sh: samtools: command not found |======= | 7%sh: samtools: command not found |============== | 14%sh: samtools: command not found |===================== | 21%sh: samtools: command not found |=========================== | 29%sh: samtools: command not found |================================== | 36%sh: samtools: command not found |========================================= | 43%sh: samtools: command not found |================================================ | 50%sh: samtools: command not found |======================================================= | 57%sh: samtools: command not found |============================================================== | 64%sh: samtools: command not found |===================================================================== | 71%sh: samtools: command not found |=========================================================================== | 79%sh: samtools: command not found |================================================================================== | 86%sh: samtools: command not found |========================================================================================= | 93%sh: samtools: command not found |================================================================================================| 100%sh: samtools: command not found

I've loaded samtools on my pc, in the same directory, do you know what causes this error?