PoisonAlien / maftools

Summarize, Analyze and Visualize MAF files from TCGA or in-house studies.
http://bioconductor.org/packages/release/bioc/html/maftools.html
MIT License

trinucleotideMatrix questions #673

Closed liuyang218622 closed 3 years ago

liuyang218622 commented 3 years ago

Describe the issue: How do I get mutational signature results using maftools?

Command: Please post your commands and the output (errors or any unexpected output). My input file is a MAF file. The header and first three records are:

Hugo_Symbol     Entrez_Gene_Id  Center  NCBI_Build      Chromosome      Start_Position  End_Position    Strand  Variant_Classification  Variant_Type    Reference_Allele        Tumor_Seq_Allele1       Tumor_Seq_Allele2       dbSNP_RS        dbSNP_Val_Status        Tumor_Sample_Barcode    Matched_Norm_Sample_Barcode     Match_Norm_Seq_Allele1  Match_Norm_Seq_Allele2  Tumor_Validation_Allele1
DVL1    1855    __UNKNOWN__     hg19    1       1275804 1275804 +       Missense_Mutation       SNP     G       G       A       144365982       byFrequency     b4386938_wes_t  b4386937_wes_b  __UNKNOWN__     __UNKNOWN__     __UNKNOWN__
Unknown         __UNKNOWN__     hg19    1       1900266 1900266 +       IGR     SNP     A       A       T                       b4386938_wes_t  b4386937_wes_b  __UNKNOWN__     __UNKNOWN__     __UNKNOWN__
PEX10   5192    __UNKNOWN__     hg19    1       2338106 2338106 +       Intron  SNP     C       C       A                       b4386938_wes_t  b4386937_wes_b  __UNKNOWN__     __UNKNOWN__     __UNKNOWN__

Then I ran:

rm(list=ls())
library(maftools)
options(stringsAsFactors =F)
annovar.laml<-read.maf('filtered_pass.annotation-v4.maf')
laml=annovar.laml
laml.tnm = trinucleotideMatrix(maf = laml, prefix = 'chr', add = TRUE, ref_genome = "BSgenome.Hsapiens.UCSC.hg19")

The error message is:

Extracting 5' and 3' adjacent bases..
Extracting +/- 20bp around mutated bases for background C>T estimation..
Estimating APOBEC enrichment scores..
Performing one-way Fisher's test for APOBEC enrichment..
Error in apply(X = apobec.fisher.dat, 1, function(x) { :
  dim(X) must have a positive length

Session info: Run sessionInfo() and post the output below

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /Bailab5/liuyang/03.RNA-seq-data/06.TCGA/TCGA-GTex-OV-overlap/04.TCGA-OV-overlap/R-3.5.1/lib64/R/lib/libRblas.so
LAPACK: /Bailab5/liuyang/03.RNA-seq-data/06.TCGA/TCGA-GTex-OV-overlap/04.TCGA-OV-overlap/R-3.5.1/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.48.0
 [3] rtracklayer_1.42.2                Biostrings_2.50.2
 [5] XVector_0.22.0                    GenomicRanges_1.34.0
 [7] GenomeInfoDb_1.18.2               IRanges_2.16.0
 [9] S4Vectors_0.20.1                  maftools_1.8.10
[11] Biobase_2.42.0                    BiocGenerics_0.28.0

loaded via a namespace (and not attached):
 [1] splines_3.5.1               foreach_1.4.4
 [3] assertthat_0.2.1            GenomeInfoDbData_1.2.0
 [5] Rsamtools_1.34.1            ggrepel_0.8.0
 [7] pillar_1.4.3                lattice_0.20-35
 [9] glue_1.4.1                  digest_0.6.25
[11] RColorBrewer_1.1-2          colorspace_1.4-1
[13] cowplot_0.9.4               Matrix_1.2-14
[15] plyr_1.8.4                  XML_3.98-1.19
[17] pkgconfig_2.0.2             bibtex_0.4.2
[19] GetoptLong_0.1.7            zlibbioc_1.28.0
[21] purrr_0.3.3                 xtable_1.8-3
[23] scales_1.0.0                BiocParallel_1.16.6
[25] tibble_2.1.3                pkgmaker_0.27
[27] ggplot2_3.2.1               withr_2.1.2
[29] SummarizedExperiment_1.12.0 lazyeval_0.2.2
[31] survival_3.1-12             magrittr_1.5
[33] crayon_1.3.4                mclust_5.4.3
[35] doParallel_1.0.14           NMF_0.21.0
[37] tools_3.5.1                 registry_0.5-1
[39] data.table_1.12.2           GlobalOptions_0.1.0
[41] matrixStats_0.54.0          gridBase_0.4-7
[43] ComplexHeatmap_1.20.0       stringr_1.4.0
[45] munsell_0.5.0               cluster_2.0.7-1
[47] rngtools_1.3.1              DelayedArray_0.8.0
[49] compiler_3.5.1              rlang_0.4.2
[51] grid_3.5.1                  RCurl_1.95-4.12
[53] iterators_1.0.10            rjson_0.2.20
[55] circlize_0.4.5              bitops_1.0-6
[57] gtable_0.2.0                codetools_0.2-15
[59] reshape2_1.4.3              R6_2.4.0
[61] GenomicAlignments_1.18.1    gridExtra_2.3
[63] dplyr_0.8.3                 shape_1.4.4
[65] stringi_1.4.3               Rcpp_1.0.1
[67] wordcloud_2.6               tidyselect_0.2.5

So, can you tell me why I get this error message? The first five steps all ran without problems.

PoisonAlien commented 3 years ago

Hi, how many samples do you have? maftools works best with cohorts; ideally you should have more than 3 samples (the more the better).
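
To answer the sample-count question, counting the distinct Tumor_Sample_Barcode values is enough. The sketch below is base R only and uses a toy data frame built from the three records posted above, not the real MAF object. With the object from read.maf, getSampleSummary(laml) or laml@data$Tumor_Sample_Barcode would be the analogue. The second half is an assumption about the cause, not something confirmed in this thread: it shows that apply() raises the same "dim(X) must have a positive length" message whenever a one-row table has lost its dimensions, which is one plausible way a single-sample input could fail.

```r
# Toy stand-in for the MAF table (hypothetical object; only the sample column matters).
maf_rows <- data.frame(
  Hugo_Symbol = c("DVL1", "Unknown", "PEX10"),
  Tumor_Sample_Barcode = rep("b4386938_wes_t", 3),
  stringsAsFactors = FALSE
)

# Number of distinct tumor samples in the table.
n_samples <- length(unique(maf_rows$Tumor_Sample_Barcode))

# Assumed failure mode, illustrated in base R:
# a one-row matrix loses its dim attribute when subset without drop = FALSE.
one_sample <- matrix(1:4, nrow = 1)
dropped <- one_sample[1, ]                  # plain vector, dim() is NULL
err <- tryCatch(
  apply(dropped, 1, sum),
  error = function(e) conditionMessage(e)   # capture the message instead of stopping
)

# Keeping the dimensions avoids the error.
kept <- one_sample[1, , drop = FALSE]
row_total <- apply(kept, 1, sum)

print(n_samples)
print(err)
```

If n_samples comes back as 1 for the real file, that would be consistent with the maintainer's point about cohort size.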