Inferring CNV from Single-Cell RNA-Seq
Error at STEP 17: HMM-based CNV prediction #242

ccruizm commented 4 years ago

Good day,

I get an error at step 17. I am using 10x Genomics v3 data with the raw count matrix as input. Below are the code I used and the error:


infercnv_obj = infercnv::run(infercnv_obj,
                             cutoff = 0.1,
                             out_dir = "output_inferCNV_with-ref2",
                             cluster_by_groups = FALSE,
                             denoise = FALSE, HMM = TRUE,
                             num_threads = 10)


INFO [2020-06-21 09:08:47] ::process_data:Start
INFO [2020-06-21 09:08:47] Checking for saved results.
INFO [2020-06-21 09:08:47] Trying to reload from step 15
INFO [2020-06-21 09:09:07] Using backup from step 15
INFO [2020-06-21 09:09:07] 

    STEP 1: incoming data

INFO [2020-06-21 09:09:07] 

    STEP 02: Removing lowly expressed genes

INFO [2020-06-21 09:09:07] 

    STEP 03: normalization by sequencing depth

INFO [2020-06-21 09:09:07] 

    STEP 04: log transformation of data

INFO [2020-06-21 09:09:07] 

    STEP 08: removing average of reference data (before smoothing)

INFO [2020-06-21 09:09:07] 

    STEP 09: apply max centered expression threshold: 3

INFO [2020-06-21 09:09:07] 

    STEP 10: Smoothing data per cell by chromosome

INFO [2020-06-21 09:09:07] 

    STEP 11: re-centering data across chromosome after smoothing

INFO [2020-06-21 09:09:07] 

    STEP 12: removing average of reference data (after smoothing)

INFO [2020-06-21 09:09:07] 

    STEP 14: invert log2(FC) to FC

INFO [2020-06-21 09:09:07] 

    STEP 15: Clustering samples (not defining tumor subclusters)

INFO [2020-06-21 09:09:07] 

    STEP 17: HMM-based CNV prediction

INFO [2020-06-21 09:09:07] predict_CNV_via_HMM_on_whole_tumor_samples
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): NA/NaN/Inf in 'y'

1. infercnv::run(infercnv_obj, cutoff = 0.1, out_dir = "output_inferCNV_with-ref2", 
 .     cluster_by_groups = F, denoise = F, HMM = T, num_threads = 10)
2. predict_CNV_via_HMM_on_whole_tumor_samples(infercnv_obj, t = HMM_transition_prob)
3. lapply(chrs, function(chr) {
 .     chr_gene_idx = which(gene_order$chr == chr)
 .     lapply(tumor_samples, function(tumor_sample_cells_idx) {
 .         gene_expr_vals = rowMeans(expr.data[chr_gene_idx, tumor_sample_cells_idx, 
 .             drop = FALSE])
 .         num_cells = length(tumor_sample_cells_idx)
 .         state_emission_params <- .get_state_emission_params(num_cells, 
 .             cnv_mean_sd, cnv_level_to_mean_sd_fit)
 .         hmm <- HiddenMarkov::dthmm(gene_expr_vals, HMM_info[["state_transitions"]], 
 .             HMM_info[["delta"]], "norm", state_emission_params)
 .         hmm_trace <- Viterbi.dthmm.adj(hmm)
 .         hmm.data[chr_gene_idx, tumor_sample_cells_idx] <<- hmm_trace
 .     })
 . })
4. FUN(X[[i]], ...)
5. lapply(tumor_samples, function(tumor_sample_cells_idx) {
 .     gene_expr_vals = rowMeans(expr.data[chr_gene_idx, tumor_sample_cells_idx, 
 .         drop = FALSE])
 .     num_cells = length(tumor_sample_cells_idx)
 .     state_emission_params <- .get_state_emission_params(num_cells, 
 .         cnv_mean_sd, cnv_level_to_mean_sd_fit)
 .     hmm <- HiddenMarkov::dthmm(gene_expr_vals, HMM_info[["state_transitions"]], 
 .         HMM_info[["delta"]], "norm", state_emission_params)
 .     hmm_trace <- Viterbi.dthmm.adj(hmm)
 .     hmm.data[chr_gene_idx, tumor_sample_cells_idx] <<- hmm_trace
 . })
6. FUN(X[[i]], ...)
7. .get_state_emission_params(num_cells, cnv_mean_sd, cnv_level_to_mean_sd_fit)
8. get_hspike_cnv_mean_sd_trend_by_num_cells_fit(infercnv_obj@.hspike)
9. lapply(tmp_names, function(cnv_level) {
 .     sd_vals = cnv_level_to_mean_sd[[cnv_level]]
 .     num_cells = seq_along(sd_vals)
 .     fit = lm(log(sd_vals) ~ log(num_cells))
 .     fit
 . })
10. FUN(X[[i]], ...)
11. lm(log(sd_vals) ~ log(num_cells))
12. lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...)

Session info:

R version 4.0.0 (2020-04-24)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Gentoo/Linux

Matrix products: default
BLAS/LAPACK: /home/cruiz/anaconda3/envs/r_env_4.0/lib/

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] cowplot_1.0.0  future_1.17.0  dplyr_1.0.0    Seurat_3.1.5   infercnv_1.4.0

Where do you think the problem might be?

Thanks in advance!
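
A note on the last two traceback frames: the failing call is `lm(log(sd_vals) ~ log(num_cells))`. With R's default `na.action` (`na.omit`), rows whose response is `NA` or `NaN` are dropped before fitting, so this message usually points to an infinite value in the response, for example `log(0)` when a standard deviation is exactly zero. The sketch below is plain base R with made-up `sd_vals` (not values from this dataset) and only illustrates how the same message can arise; it does not show what infercnv actually computed here.

```r
# Made-up standard deviations; the zero stands in for a degenerate estimate.
sd_vals <- c(0.30, 0.21, 0, 0.15)
num_cells <- seq_along(sd_vals)

# log(0) is -Inf, and na.omit does not remove infinite values.
log_sd <- log(sd_vals)
has_infinite <- any(is.infinite(log_sd))

# Capture the error message instead of stopping the script.
fit_msg <- tryCatch(
  {
    lm(log(sd_vals) ~ log(num_cells))
    "fit succeeded"
  },
  error = function(e) conditionMessage(e)
)

print(has_infinite)
print(fit_msg)
```

If a real run follows the same pattern, checking the standard deviations derived from `infercnv_obj@.hspike` (frame 8 of the traceback) for zeros would be a reasonable first step.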

GeorgescuC commented 4 years ago

Hi @ccruizm ,

I am not sure why there would be NA/NaN/Inf at this step in the process. Does the preliminary plot look normal? Could you try rerunning things from the start by either using resume_mode=FALSE or emptying the output folder? If the error still occurs, I will probably need to debug things using the data.

Regards, Christophe.
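
The two options suggested above can be sketched as follows. `reset_out_dir` is a hypothetical helper, not part of infercnv. The `infercnv::run()` arguments are copied from the traceback earlier in this thread, and the call is guarded so that it only executes when infercnv is installed and an `infercnv_obj` already exists in the session.

```r
# Hypothetical helper (not part of infercnv): delete the folder and recreate
# it empty, so that no saved step backups remain.
reset_out_dir <- function(path) {
  if (dir.exists(path)) {
    unlink(path, recursive = TRUE)
  }
  dir.create(path, recursive = TRUE)
  invisible(path)
}

out_dir <- "output_inferCNV_with-ref2"  # folder name taken from the traceback

if (requireNamespace("infercnv", quietly = TRUE) && exists("infercnv_obj")) {
  reset_out_dir(out_dir)  # option 2: empty the output folder
  infercnv_obj <- infercnv::run(
    infercnv_obj,
    cutoff = 0.1,
    out_dir = out_dir,
    cluster_by_groups = FALSE,
    denoise = FALSE,
    HMM = TRUE,
    num_threads = 10,
    resume_mode = FALSE  # option 1: do not reload saved results
  )
}
```

Either option alone should force a full recomputation; both are shown together only for illustration. Note that `reset_out_dir` deletes everything in the folder, so copy any results you want to keep first.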

mmfalco commented 3 years ago

I'm having the same problem here with version 1.7.1 of the package. Strangely, it was solved by using a base::matrix object instead of the sparse Matrix class. So when creating the infercnv object I did:

infercnv_obj = CreateInfercnvObject(as.matrix(counts_matrix),

And it worked.
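
A small sketch of that workaround, using a toy sparse matrix in place of real counts. The gene and cell names are made up, and the remaining `CreateInfercnvObject()` arguments are left out because they depend on your own annotation and gene-order files.

```r
library(Matrix)

# Toy 2 x 3 count matrix stored in sparse form (made-up genes and cells).
counts_matrix <- Matrix(
  c(0, 3, 0, 1, 0, 2),
  nrow = 2,
  sparse = TRUE,
  dimnames = list(c("geneA", "geneB"), c("cell1", "cell2", "cell3"))
)

# Convert to a base R matrix before building the infercnv object.
counts_dense <- as.matrix(counts_matrix)

print(class(counts_matrix))
print(class(counts_dense))

# The dense matrix would then be passed as the first argument, e.g.
# CreateInfercnvObject(raw_counts_matrix = counts_dense, ...)
```

Keep in mind that a dense numeric matrix stores every entry, at roughly 8 bytes each, so this conversion may not fit in memory for very large datasets.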

I think this has to do with the problems I've been having with the "cluster_by_groups=F" argument in the infercnv::run() function.

MasonDou commented 11 months ago


Hi @GeorgescuC, I hit the same problem at step 17, with the same error message. The weird part is that when I use "out_dir = tempfile()" the function runs fine, but when I pass a folder name, the bug appears. Looking forward to your reply, thank you!
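
One possible explanation, not confirmed by the maintainers: the log in the first post shows infercnv reloading a backup from step 15, whereas `tempfile()` returns a path that does not exist yet, so a run written there has nothing to reload. A quick base-R check of whether a named folder already holds files from an earlier run (`has_saved_results` is a hypothetical helper, not an infercnv function):

```r
# Hypothetical helper: TRUE if the folder exists and contains at least one file.
has_saved_results <- function(path) {
  dir.exists(path) && length(list.files(path)) > 0
}

# A path from tempfile() has not been created yet, so nothing can be reloaded.
fresh_dir <- tempfile()
print(has_saved_results(fresh_dir))

# A named folder may still hold results from a previous run.
named_dir <- "output_inferCNV_with-ref2"  # folder name from the first post
print(has_saved_results(named_dir))
```

If the named folder reports existing files, rerunning into an empty folder or with resume_mode=FALSE, as suggested earlier in the thread, would help tell a stale backup apart from a problem in the data itself.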

brianjohnhaas commented 11 months ago

Hi all - we've had a lapse in funding towards infercnv and have limited resources for tech support. Hopefully that changes, but in the meantime, we don't have resources to provide tech support.

We'll put up a banner about this sometime soon. In the meantime, hopefully users can help each other out.

