Inferring CNV from Single-Cell RNA-Seq
Errors when running BayesNet with diagnostics = TRUE #252

liu-xingliang commented 4 years ago

Error messages:

        STEP 18: Run Bayesian Network Model on HMM predicted CNV's

INFO [2020-08-24 12:04:12] Creating the following Directory:  TTK/BayesNetOutput.HMMi6.hmm_mode-samples
INFO [2020-08-24 12:04:12] Initializing new MCM InferCNV Object.
INFO [2020-08-24 12:04:12] validating infercnv_obj
INFO [2020-08-24 12:04:13] Total CNV's:  1360
INFO [2020-08-24 12:04:13] Loading BUGS Model.
INFO [2020-08-24 12:04:15] Running Sampling Using Parallel with  8 Cores

INFO [2020-08-24 13:51:15] Obtaining probabilities post-sampling
INFO [2020-08-24 14:04:05] Gibbs sampling time:  119.831170745691  Minutes
INFO [2020-08-24 14:05:17] Creating Diagnostic Plots.
Error in lapply(seq_along(mcmc), function(i) { :
  argument "mcmc" is missing, with no default

Session info:

> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /home/anaconda3/envs/R401/lib/

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] infercnv_1.4.0

loaded via a namespace (and not attached):
 [1] Biobase_2.48.0              tidyr_1.1.0
 [3] edgeR_3.30.3                jsonlite_1.6.1
 [5] splines_4.0.1               foreach_1.5.0
 [7] gtools_3.8.2                argparse_2.0.1
 [9] stats4_4.0.1                HiddenMarkov_1.8-11
[11] coin_1.3-1                  GenomeInfoDbData_1.2.3
[13] globals_0.12.5              pillar_1.4.4
[15] lattice_0.20-41             glue_1.4.1
[17] limma_3.44.2                digest_0.6.25
[19] GenomicRanges_1.40.0        RColorBrewer_1.1-2
[21] XVector_0.28.0              colorspace_1.4-1
[23] sandwich_2.5-1              plyr_1.8.6
[25] Matrix_1.2-18               pkgconfig_2.0.3
[27] listenv_0.8.0               zlibbioc_1.34.0
[29] purrr_0.3.4                 mvtnorm_1.1-1
[31] scales_1.1.1                gdata_2.18.0
[33] rjags_4-10                  tibble_3.0.1
[35] generics_0.0.2              IRanges_2.22.2
[37] ggplot2_3.3.1               ellipsis_0.3.1
[39] TH.data_1.0-10              SummarizedExperiment_1.18.1
[41] fastcluster_1.1.25          BiocGenerics_0.34.0
[43] survival_3.1-12             magrittr_1.5
[45] crayon_1.3.4                future_1.17.0
[47] doParallel_1.0.15           nlme_3.1-148
[49] MASS_7.3-51.6               gplots_3.0.3
[51] tools_4.0.1                 fitdistrplus_1.1-1
[53] formatR_1.7                 lifecycle_0.2.0
[55] matrixStats_0.56.0          multcomp_1.4-13
[57] S4Vectors_0.26.1            findpython_1.0.5
[59] munsell_0.5.0               locfit_1.5-9.4
[61] DelayedArray_0.14.0         lambda.r_1.2.4
[63] compiler_4.0.1              GenomeInfoDb_1.24.0
[65] caTools_1.18.0              rlang_0.4.6
[67] futile.logger_1.4.3         grid_4.0.1
[69] RCurl_1.98-1.2              iterators_1.0.12
[71] SingleCellExperiment_1.10.1 bitops_1.0-6
[73] gtable_0.3.0                codetools_0.2-16
[75] reshape_0.8.8               R6_2.4.1
[77] gridExtra_2.3               zoo_1.8-8
[79] dplyr_1.0.0                 libcoin_1.0-5
[81] futile.options_1.0.1        KernSmooth_2.23-17
[83] ape_5.4                     modeltools_0.2-23
[85] parallel_4.0.1              Rcpp_1.0.4.6
[87] vctrs_0.3.1                 tidyselect_1.1.0
[89] coda_0.19-3

liu-xingliang commented 4 years ago

Running command:

infercnv_obj = CreateInfercnvObject(
    raw_counts_matrix=paste0('../../rerun.v140.meganormal/', patient, '.', tp, ".matrix"),
    annotations_file=paste0('../', patient, '.', tp, ".anno"), delim="\t",
    gene_order_file="../../../refdata-cellranger-GRCh38-3.0.0.gene_pos.chr_prefix.txt",
    ref_group_names=c('N_L', 'N_RB'))
infercnv_obj = infercnv::run(
    infercnv_obj, cutoff=0.1, out_dir=paste0(patient, '.', tp),
    cluster_by_groups=TRUE, denoise=TRUE, resume_mode=FALSE, no_prelim_plot=FALSE,
    HMM=TRUE, BayesMaxPNormal=0.5, diagnostics=TRUE, num_threads=8)

GeorgescuC commented 4 years ago

Hi @liuxl18-hku ,

Could you try updating to the github version of the code by running:

devtools::install_github("broadinstitute/infercnv", ref="RELEASE_3_10")

And then try to run the second command again?

Regards, Christophe.

liu-xingliang commented 4 years ago

@GeorgescuC ,

Thank you. After installing with your command, sessionInfo() shows infercnv_1.2.2, which is even lower than my previous version, infercnv_1.4.0, which I got from Bioconductor.

I am re-running my commands using the re-installed version.


GeorgescuC commented 4 years ago

Hi @liuxl18-hku

My bad, I copied the commands from the wiki but forgot to change the target branch. If you run the following, the version should show 1.4.0 again, but it has additional commits compared to the version on Bioconductor.

devtools::install_github("broadinstitute/infercnv", ref="master")

Regards, Christophe.
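
Since the version reported after installing from a branch can differ from what was expected, it may help to check it before re-running. The following is a minimal sketch only: `version_at_least` is a hypothetical helper name (not part of infercnv), and the "1.4.0" threshold is simply the Bioconductor version mentioned in this thread.

```r
# Hypothetical helper (not part of infercnv): returns TRUE/FALSE depending on
# whether the installed version of `pkg` is at least `minimum`, or NA when the
# package is not installed at all.
version_at_least <- function(pkg, minimum) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  # packageVersion() reads the installed DESCRIPTION; comparison with a
  # character string is done component-wise (e.g. "1.10.0" > "1.4.0").
  utils::packageVersion(pkg) >= minimum
}

# Assumption: 1.4.0 is used here only because it is the version quoted above.
status <- version_at_least("infercnv", "1.4.0")
if (is.na(status)) {
  message("infercnv is not installed")
} else if (status) {
  message("infercnv ", as.character(utils::packageVersion("infercnv")), " is installed")
} else {
  message("installed infercnv is older than 1.4.0")
}
```

Note that a GitHub build and a Bioconductor build can report the same version number while containing different commits, so this check only catches the case where an older branch was installed.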

liu-xingliang commented 4 years ago

@GeorgescuC ,

Never mind, I just updated my installation and ended up with infercnv_1.5.0. I am now testing it on a 19838 genes x 8315 cells dataset, running the Bayesian Network Model on HMM predicted CNV's with four cores and 250 GB of memory allocated.

It seems to take quite a while to run. I didn't use more cores because that seems to consume more memory and made my last few test runs crash.

Thank you.

GeorgescuC commented 4 years ago

Hi @liuxl18-hku ,

For 8315 cells and 19838 genes, 250GB of memory should be more than enough. Are you running on a cluster grid, or using a tool that could limit the memory allowed for a job (such as running in docker)?

At this time, parallelization is only used for the random trees subclustering (if that option is enabled) and for the Bayesian filtering.

Regards, Christophe.
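
One way to check from inside R whether a container is capping memory is to read the Linux cgroup limit files. This is a rough sketch under assumptions: `read_memory_limit` is a hypothetical helper, the two paths are the standard cgroup v2 and v1 locations on Linux, and a cluster scheduler may enforce its limits through other means that this does not detect.

```r
# Hypothetical helper: return the memory limit in bytes from the first cgroup
# file that exists, or NA when no file is found or the limit is "max"
# (cgroup v2's marker for unlimited).
read_memory_limit <- function(paths = c(
    "/sys/fs/cgroup/memory.max",                   # cgroup v2
    "/sys/fs/cgroup/memory/memory.limit_in_bytes"  # cgroup v1
  )) {
  for (p in paths) {
    if (file.exists(p)) {
      value <- trimws(readLines(p, n = 1, warn = FALSE))
      if (length(value) == 0 || identical(value, "max")) {
        return(NA_real_)
      }
      # cgroup v1 reports "unlimited" as a very large number rather than "max".
      return(as.numeric(value))
    }
  }
  NA_real_
}

limit <- read_memory_limit()
if (is.na(limit)) {
  message("No cgroup memory limit detected")
} else {
  message(sprintf("cgroup memory limit: %.1f GB", limit / 1024^3))
}
```

If the reported limit is well below the memory requested for the job, the crashes seen with more cores are more likely due to the container or scheduler than to the dataset size.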

liu-xingliang commented 4 years ago


Yes, thank you, the job finished successfully after a few hours.