Open Estherrr1997 opened 3 years ago
same question
Hi all, Im sorry for the delay. Can you paste your R session? In which OS are you working? thanks !
Thank you for your reply. I run it in ubuntu 20 I tried with my own data and the pipe runs well without error. But in the function of exportIntegratedSignals, no plot was generated.
sessionInfo() R version 4.0.3 (2020-10-10) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.04.2 LTS
Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=zh_CN.UTF-8 LC_NUMERIC=C LC_TIME=zh_CN.UTF-8 LC_COLLATE=zh_CN.UTF-8 LC_MONETARY=zh_CN.UTF-8
[6] LC_MESSAGES=zh_CN.UTF-8 LC_PAPER=zh_CN.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C
attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] DT_0.17 GenomicFeatures_1.42.1 ASpli_2.0.0 edgeR_3.32.1
[5] limma_3.46.0 ggplot2_3.3.3 pheatmap_1.0.12 doParallel_1.0.16
[9] iterators_1.0.13 foreach_1.5.1 NMF_0.23.0 cluster_2.1.0
[13] rngtools_1.5 pkgmaker_0.32.2 registry_0.5-1 BSgenome.Hsapiens.UCSC.hg38_1.4.3
[17] BSgenome_1.58.0 rtracklayer_1.50.0 Biostrings_2.58.0 XVector_0.30.0
[21] GenomicRanges_1.42.0 GenomeInfoDb_1.26.2 maftools_2.6.05 BiocParallel_1.24.1
[25] GSEABase_1.52.1 graph_1.68.0 annotate_1.68.0 XML_3.99-0.5
[29] AnnotationDbi_1.52.0 IRanges_2.24.1 S4Vectors_0.28.1 Biobase_2.50.0
[33] BiocGenerics_0.36.0 GSVA_1.38.2
loaded via a namespace (and not attached):
[1] backports_1.2.1 Hmisc_4.5-0 BiocFileCache_1.14.0 plyr_1.8.6 igraph_1.2.6
[6] lazyeval_0.2.2 splines_4.0.3 crosstalk_1.1.1 gridBase_0.4-7 digest_0.6.27
[11] ensembldb_2.14.0 htmltools_0.5.1.1 checkmate_2.0.0 magrittr_2.0.1 memoise_2.0.0
[16] remotes_2.2.0 matrixStats_0.58.0 vroom_1.4.0 askpass_1.1 prettyunits_1.1.1
[21] jpeg_0.1-8.1 colorspace_2.0-0 blob_1.2.1 rappdirs_0.3.3 xfun_0.21
[26] dplyr_1.0.4 jsonlite_1.7.2 crayon_1.4.1 RCurl_1.98-1.2 VariantAnnotation_1.36.0
[31] survival_3.2-7 glue_1.4.2 gtable_0.3.0 zlibbioc_1.36.0 UpSetR_1.4.0
[36] DelayedArray_0.16.2 scales_1.1.1 DBI_1.1.1 Rcpp_1.0.6 htmlTable_2.1.0
[41] xtable_1.8-4 progress_1.2.2 foreign_0.8-81 bit_4.0.4 Formula_1.2-4
[46] htmlwidgets_1.5.3 httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.1 pkgconfig_2.0.3
[51] farver_2.1.0 Gviz_1.34.0 nnet_7.3-15 dbplyr_2.1.0 locfit_1.5-9.4
[56] tidyselect_1.1.0 labeling_0.4.2 rlang_0.4.10 reshape2_1.4.4 munsell_0.5.0
[61] tools_4.0.3 cachem_1.0.4 generics_0.1.0 RSQLite_2.2.3 evaluate_0.14
[66] stringr_1.4.0 fastmap_1.1.0 yaml_2.2.1 knitr_1.31 bit64_4.0.5
[71] purrr_0.3.4 AnnotationFilter_1.14.0 xml2_1.3.2 biomaRt_2.46.3 BiocStyle_2.18.1
[76] compiler_4.0.3 rstudioapi_0.13 curl_4.3 png_0.1-7 statmod_1.4.35
[81] tibble_3.1.0 stringi_1.5.3 lattice_0.20-41 ProtGenerics_1.22.0 Matrix_1.3-2
[86] vctrs_0.3.6 pillar_1.5.0 lifecycle_1.0.0 BiocManager_1.30.10 data.table_1.14.0
[91] bitops_1.0-6 R6_2.5.0 latticeExtra_0.6-29 gridExtra_2.3 codetools_0.2-18
[96] dichromat_2.0-0 MASS_7.3-53 assertthat_0.2.1 SummarizedExperiment_1.20.0 openssl_1.4.3
[101] withr_2.4.1 GenomicAlignments_1.26.0 Rsamtools_2.6.0 GenomeInfoDbData_1.2.4 hms_1.0.0
[106] rpart_4.1-15 grid_4.0.3 tidyr_1.1.2 rmarkdown_2.7 MatrixGenerics_1.2.1
[111] biovizBase_1.38.0 base64enc_0.1-3
I am using Rstudio in Windows 10
This is the code I ran:
library(ASpli)
BAMFiles <- c("path_to_bams/CT_time1_rep1.BAM", "path_to_bams/CT_time1_rep2.BAM", "path_to_bams/CT_time2_rep1.BAM", "path_to_bams/CT_time2_rep2.BAM", "path_to_bams/TR_time1_rep1.BAM", "path_to_bams/TR_time1_rep2.BAM", "path_to_bams/TR_time2_rep1.BAM", "path_to_bams/TR_time2_rep2.BAM") (targets <- data.frame( bam = BAMFiles, genotype = c( 'CT', 'CT', 'CT', 'CT', 'TR', 'TR', 'TR', 'TR' ), time = c( 't1', 't1', 't2', 't2', 't1', 't1', 't2', 't2' ), stringsAsFactors = FALSE ))
getConditions( targets )
library( GenomicFeatures ) gtfFileName <- aspliExampleGTF() genomeTxDb <- makeTxDbFromGFF( gtfFileName ) features <- binGenome( genomeTxDb )
BAMFiles <- aspliExampleBamList()
targets <- data.frame( row.names = paste0('Sample',c(1:12)), bam = BAMFiles, f1 = c( 'A','A','A','A','A','A', 'B','B','B','B','B','B'), f2 = c( 'C','C','C','D','D','D', 'C','C','C','D','D','D'), stringsAsFactors = FALSE)
getConditions(targets) mBAMs <- data.frame(bam = sub("_[02]","",targets$bam[c(1,4,7,10)]), condition= c("A_C","A_D","B_C","B_D"))
gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0)
asd<- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0, threshold = 5, minAnchor = 10)
gb <- gbDUreport(gbcounts,contrast = c( 1, -1, -1, 1 ) )
genesDE(gb)[1:5,] binsDU(gb)[1:5,]
jdur <- jDUreport(asd, contrast = c( 1, -1, -1, 1 ))
localej(jdur)[1:5,1:8]
sr <- splicingReport(gb, jdur, counts = gbcounts) is <- integrateSignals(sr,asd)
exportIntegratedSignals( is, output.dir="aspliExample", sr, gbcounts, features, asd, mBAMs)
exportIntegratedSignals( is, output.dir="is", sr, gbcounts, features, asd, mBAMs, jCompletelyIncluded = FALSE, zoomRegion = 1.5, useLog = FALSE, tcex = 1, ntop = NULL, openInBrowser = T, makeGraphs = TRUE, bforce=FALSE )
form <- formula(~f1+f2+f1:f2)
model.matrix(form,targets)
Yes, please, samtools should be installed. Anyway I'm checking your error in detail and tell you something soon. Also, we did update our vignette and remove unuseful code. We are pushing it too devel branch in Bioconductor
I do have samtools installed on my laptop, do I have to pass the path to where samtools is installed? I dont know how to fix this error, I really hope you can help me
Hi Esther, 2 ideas:
1.- Since you are working on Windows, please check you have correct permissions on your output folders and also If samtools is running correctly.
https://support.bioconductor.org/p/98783/
2.- We prepared a new repo for the updated version of ASpli. Here also you can find a brief example:
https://github.com/achernom/ASpli
Can you check it and tell me something?
thanks
Thanks, I fixed the samtools error and used the code from the new ASpli version. I did not have a 'txdb.sqlite' file so I created one, I ran the following:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("ASpli")
library(ASpli)
download.file("https://www.encodeproject.org/files/gencode.v24.primary_assembly.annotation/@@download/gencode.v24.primary_assembly.annotation.gtf.gz", "gencode.v24.primary_assembly.annotation.gtf.gz") library(GenomicFeatures) txdb = makeTxDbFromGFF('gencode.v24.primary_assembly.annotation.gtf.gz')
library(AnnotationDbi) saveDb(txdb, 'txdb.sqlite')
genome <- loadDb("txdb.sqlite") features <- binGenome(genome) targets <- data.frame(bam = c("CT_1.BAM", "CT_2.BAM","CT_3.BAM", "TR_1.BAM", "TR_2.BAM", "TR_3.BAM"), genotype = c( "CT", "CT", "CT", "TR", "TR", "TR" ), stringsAsFactors = FALSE ) mBAMs <- data.frame(bam=c("CT.BAM", "TR.BAM"),condition=c("CT","TR"))
counts <- gbCounts(features = features, targets = targets, minReadLength = 125L, maxISize = 50000)
and got the next message: Summarizing CT_1 Error in value[3L] : failed to open BamFile: file(s) do not exist: 'CT_1.BAM'
Do I have to download the test data somewhere?
thankyou
Hi Esther, test data is a generic example... you should replace with your own bam files. Dont forget to have .bai files in same folder! I'll be waiting for your news,
Okay thanks! I have my own txdb.sqlite file and want to test ASpli on like 2 to 6 bam files. Do I have to change the code to something like this?:
targets <- data.frame(bam = c("first.bam", "second.bam"), genotype = c( "CT", "TR" ), stringsAsFactors = FALSE ) mBAMs <- data.frame(bam=c("CT.BAM", "TR.BAM"),condition=c("CT","TR")) counts <- gbCounts(features = features, targets = targets, minReadLength = 125L, maxISize = 50000)
please correct me if this is not the right way to do this
thank you so much for helping me, really appreciate it :)
You need to use duplicates Do you have them?
I am working with patient data; the bam files contain information from different people with the same mutation, is it possible to use this information/data as "duplicates"? If so, how do I load the bam files in the correct way?
You should decide what you want to analyze ... Do you have control samples (i.e no mutations?)
No we only have data from patients with mutations; we would like to compare these mutations and see the different splice events that occur. I thought maybe I can use 3 bam files from mutation X and compare that with 3 bam files from mutation Y. Is that possible with ASpli?
You can try and see if you can detect differences in AS. Where is your mutation? I think It should be something really disruptive if you want to see effects on AS...
Yes I really want to try that, we are currently investigating the AS events in leukemia and we would like to visualize it too. Do the bam files have to be named "CT_X.BAM/TR_X.BAM" in order to use: genotype = c( "CT", "CT", "CT", "TR", "TR", "TR" ), stringsAsFactors = FALSE )?
I tried running ASpli on my own data, it seemed to go well but then I got this error:
-Differential signal estimation:
gb <- gbDUreport(counts, contrast = c(-1, 1)) Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'gbDUreport' for signature '"standardGeneric"'
what does the "gbDUreport(counts, contrast = c(-1, 1))" do?
Esther
it does the differential gene expression and bin usage signal estimation Run: 1: class(counts) 2.- gb <- gbDUreport(gbcounts) (not specifying contrast) thanks
It seems counts is not correct. Can you check?
counts object seems to be ok. Maybe there is an error in the script. Can you paste it? thanks
gb <- gbDUreport(counts, contrast = c(-1, 1)) should be the correct one
Can you show me how do you define the targets data.frame? How do you define conditions here?
PILOT RUN targets_test <- data.frame( ID = bam_ids, bam = bam_fnames, stringsAsFactors = F ) targets_test <- merge(targets_test, aml_classes_sub, by = 'ID') targets_test <- tibble::column_to_rownames(targets_test, 'ID') targets_test
txs!
Hi Esther, yes.
You should use: short_001 and short_002 as condition/phenotype
You can fix using: targets$genotype<-rep(c("short_001","short_002"), each=3)
I'm waiting for your results
Estefi
I suspect from your annotation. What do you see as output when you run: xdb = makeTxDbFromGFF('gencode.v24.primary_assembly.annotation.gtf.gz')
You can find this info in a log file inside your directory
Yes, I refer to this log filean
Can you run this code ? txdb = makeTxDbFromGFF('gencode.v24.primary_assembly.annotation.gtf.gz')
thanks
This error means you dont have the file in your directory: "No such file or directory"
Is this your annotation file? Can you find the correct path?
You should generate an appropiate genome DB. If you don't have exons in your annotation file, you won't be able to quantify alternative splicing.
Try to find the original GFT you've used to build the TxDb
I guess there is something wrong with my TxDb.. I am trying to generate the splice graphs using the data available from https://bioconductor.org/packages/release/bioc/vignettes/ASpli/inst/doc/ASpli.R and ran this:
library( ASpli ) library( GenomicFeatures ) gtfFileName <- aspliExampleGTF() genomeTxDb <- makeTxDbFromGFF( gtfFileName ) features <- binGenome( genomeTxDb )
BAMFiles <- aspliExampleBamList() targets <- data.frame( row.names = paste0('Sample',c(1:12)), bam = BAMFiles, f1 = c( 'A','A','A','A','A','A', 'B','B','B','B','B','B'), f2 = c( 'C','C','C','D','D','D', 'C','C','C','D','D','D'), stringsAsFactors = FALSE) getConditions(targets) ###################################################
################################################### mBAMs <- data.frame(bam = sub("_[02]","",targets$bam[c(1,4,7,10)]), condition= c("A_C","A_D","B_C","B_D"))
###################################################
################################################### gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0)
asd<- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0, threshold = 5, minAnchor = 10) ###################################################
################################################### gb <- gbDUreport(gbcounts,contrast = c( 1, -1, -1, 1 ) )
###################################################
################################################### genesDE(gb)[1:5,] binsDU(gb)[1:5,]
###################################################
################################################### jdur <- jDUreport(asd, contrast = c( 1, -1, -1, 1 ))
###################################################
###################################################
###################################################
################################################### localej(jdur)[1:5,1:8]
###################################################
################################################### sr <- splicingReport(gb, jdur, counts = gbcounts) is <- integrateSignals(sr,asd)
###################################################
################################################### exportIntegratedSignals( is, output.dir="aspliExample", sr, gbcounts, features, asd, mBAMs)
###################################################
################################################### form <- formula(~f1+f2+f1:f2)
###################################################
################################################### model.matrix(form,targets)
This all seems to work fine, although I get this message again when trying to generate the graphs:
exportIntegratedSignals( is, output.dir="aspliExample",
- sr, gbcounts, features, asd, mBAMs) Generating graphs... | | 0%sh: samtools: command not found |======= | 7%sh: samtools: command not found |============== | 14%sh: samtools: command not found |===================== | 21%sh: samtools: command not found |=========================== | 29%sh: samtools: command not found |================================== | 36%sh: samtools: command not found |========================================= | 43%sh: samtools: command not found |================================================ | 50%sh: samtools: command not found |======================================================= | 57%sh: samtools: command not found |============================================================== | 64%sh: samtools: command not found |===================================================================== | 71%sh: samtools: command not found |=========================================================================== | 79%sh: samtools: command not found |================================================================================== | 86%sh: samtools: command not found |========================================================================================= | 93%sh: samtools: command not found |================================================================================================| 100%sh: samtools: command not found
I've loaded samtools on my pc, in the same directory, do you know what causes this error?
Hi,
I have followed the ASpli tutorial from: https://rdrr.io/bioc/ASpli/src/inst/doc/ASpli.R and ran the commands (the ones that weren't commented out), although, when I try to visualize the alternative splicing events using exportIntegratedSignals, it returns an html page in which I can click on the "View" buttons; this returns only a small black block and not the output as described in figure 8 in: https://www.bioconductor.org/packages/devel/bioc/vignettes/ASpli/inst/doc/ASpli.pdf Is there a way to fix this?
Also, do the bam files come from the ASpli package or do I have to download them?
Kind regards, Esther