broadinstitute / infercnv

Inferring CNV from Single-Cell RNA-Seq

inferCNV : Error in obs_dendrogram[[1]] : subscript out of bounds #472

Open ly55 opened 2 years ago

ly55 commented 2 years ago
STEP 15: Clustering samples (not defining tumor subclusters)

INFO [2022-10-26 23:06:39] define_signif_tumor_subclusters(p_val=0.1
INFO [2022-10-26 23:06:39] define_signif_tumor_subclusters(), tumor: Tumor cells
INFO [2022-10-26 23:06:39] cut tree into: 1 groups
INFO [2022-10-26 23:06:39] -processing Tumor cells,Tumor cells_s1
INFO [2022-10-26 23:06:40] ::plot_cnv:Start
INFO [2022-10-26 23:06:40] ::plot_cnv:Current data dimensions (r,c)=4302,100 Total=430591.808773644 Min=0.792982987475879 Max=1.27722856888669.
INFO [2022-10-26 23:06:40] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2022-10-26 23:06:41] plot_cnv(): auto thresholding at: (0.882492 , 1.119330)
INFO [2022-10-26 23:06:41] plot_cnv_observation:Start
INFO [2022-10-26 23:06:41] Observation data size: Cells= 0 Genes= 4302
Error in obs_dendrogram[[1]] : subscript out of bounds
In addition: Warning message:
In max(nchar(obs_annotations_names)) : no non-missing arguments to max; returning -Inf

GeorgescuC commented 2 years ago

Hi @ly55 ,

The log messages show that no cells are defined as observations (`Observation data size: Cells= 0`), meaning all 100 of your cells are defined as references. inferCNV expects at least some cells to be observations; otherwise there is nothing to compare against the references.

Best, Christophe.

jgarces02 commented 2 years ago

Hi, I get the same error, but I think my reference and tumor cells are named correctly. This is the code I'm running and, as you can see, I have both reference and tumor cells for each sample; no group has zero cells...

> table(all_merge_ff_annot@meta.data$normal_tumor)

normal  tumor
 13770  18115

> table(all_merge_ff_annot@meta.data$normal_tumor, all_merge_ff_annot@meta.data$orig.ident)

        DS1  DS2  DS3 mm10 mm11  mm2  mm3  mm5  mm6  mm7  mm8  mm9
normal 1419 2601 3701 2496  129   26   73   51  121   13   99 3041
tumor    32   30   72  951 5423  435 1463 2667 1873 1488 3085  596

> infercnv_obj <- CreateInfercnvObject(raw_counts_matrix = "mymtx.mtx", 
     annotations_file = "myannot.txt", delim = "\t", 
     gene_order_file = "gene_ordering_file.me.txt", 
     ref_group_names = c("normal", "tumor"))

INFO [2022-11-01 14:58:41] Parsing matrix: all_merge_ff_annot.infercnv.20221020.mtx
INFO [2022-11-01 15:04:00] Parsing gene order file: gene_ordering_file.me.infercnv.txt
INFO [2022-11-01 15:04:00] Parsing cell annotations file: all_merge_ff_annot.metadata.20221024.infercnv.txt
INFO [2022-11-01 15:04:00] ::order_reduce:Start.
INFO [2022-11-01 15:04:04] .order_reduce(): expr and order match.
INFO [2022-11-01 15:04:08] ::process_data:order_reduce:Reduction from positional data, new dimensions (r,c) = 36601,31885 Total=228917337 Min=0 Max=26389.
INFO [2022-11-01 15:04:11] num genes removed taking into account provided gene ordering list: 383 = 1.04641949673506% removed.
INFO [2022-11-01 15:04:13] -filtering out cells < 100 or > Inf, removing 0 % of cells
INFO [2022-11-01 15:04:46] validating infercnv_obj

> infercnv_obj <- run(infercnv_obj, num_threads = 20, out_dir = "mydir",
      cluster_by_groups = T, cluster_references = F, cutoff = 0.1,
      HMM = T, HMM_type = "i6", 
      analysis_mode = "subclusters", tumor_subcluster_partition_method = "leiden", tumor_subcluster_pval = 0.25,
      denoise = T)

INFO [2022-11-01 15:06:52] ::process_data:Start
INFO [2022-11-01 15:06:52] Creating output path all_merge_ff_annot.infercnv.20221101
INFO [2022-11-01 15:06:52] Checking for saved results.
INFO [2022-11-01 15:06:52]

        STEP 1: incoming data

INFO [2022-11-01 15:08:44]

        STEP 02: Removing lowly expressed genes

INFO [2022-11-01 15:08:44] ::above_min_mean_expr_cutoff:Start
INFO [2022-11-01 15:08:48] Removing 31036 genes from matrix as below mean expr threshold: 0.1
INFO [2022-11-01 15:08:49] validating infercnv_obj
INFO [2022-11-01 15:08:49] There are 5182 genes and 31885 cells remaining in the expr matrix.
INFO [2022-11-01 15:08:56] no genes removed due to min cells/gene filter

        STEP 03: normalization by sequencing depth

INFO [2022-11-01 15:09:47] normalizing counts matrix by depth
INFO [2022-11-01 15:09:51] Computed total sum normalization factor as median libsize: 4277.000000
INFO [2022-11-01 15:09:52] Adding h-spike
INFO [2022-11-01 15:09:52] -hspike modeling of normal
INFO [2022-11-01 15:11:23] -hspike modeling of tumor
INFO [2022-11-01 15:13:05] validating infercnv_obj
INFO [2022-11-01 15:13:05] normalizing counts matrix by depth
INFO [2022-11-01 15:13:05] Using specified normalization factor: 4277.000000
INFO [2022-11-01 15:14:01]

        STEP 04: log transformation of data

INFO [2022-11-01 15:14:01] transforming log2xplus1()
INFO [2022-11-01 15:14:06] -mirroring for hspike
INFO [2022-11-01 15:14:06] transforming log2xplus1()
INFO [2022-11-01 15:15:02]

        STEP 08: removing average of reference data (before smoothing)

INFO [2022-11-01 15:15:02] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2022-11-01 15:15:02] subtracting mean(normal) per gene per cell across all data
INFO [2022-11-01 15:15:11] -subtracting expr per gene, use_bounds=TRUE
INFO [2022-11-01 15:15:24] -mirroring for hspike
INFO [2022-11-01 15:15:24] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2022-11-01 15:15:24] subtracting mean(normal) per gene per cell across all data
INFO [2022-11-01 15:15:26] -subtracting expr per gene, use_bounds=TRUE
INFO [2022-11-01 15:17:03]

        STEP 09: apply max centered expression threshold: 3

INFO [2022-11-01 15:17:03] ::process_data:setting max centered expr, threshold set to: +/-:  3
INFO [2022-11-01 15:17:06] -mirroring for hspike
INFO [2022-11-01 15:17:06] ::process_data:setting max centered expr, threshold set to: +/-:  3
INFO [2022-11-01 15:18:41]

        STEP 10: Smoothing data per cell by chromosome

INFO [2022-11-01 15:18:41] smooth_by_chromosome: chr: chr1p
INFO [2022-11-01 15:19:19] smooth_by_chromosome: chr: chr1q
INFO [2022-11-01 15:20:05] smooth_by_chromosome: chr: chr2p
(...)
INFO [2022-11-01 15:39:28] smooth_by_chromosome: chr: chr_E
INFO [2022-11-01 15:39:29] smooth_by_chromosome: chr: chr_3pt0
INFO [2022-11-01 15:39:29] smooth_by_chromosome: chr: chr_F
INFO [2022-11-01 15:41:06]

        STEP 11: re-centering data across chromosome after smoothing

INFO [2022-11-01 15:41:06] ::center_smooth across chromosomes per cell
INFO [2022-11-01 15:41:38] -mirroring for hspike
INFO [2022-11-01 15:41:38] ::center_smooth across chromosomes per cell
INFO [2022-11-01 15:43:18]

        STEP 12: removing average of reference data (after smoothing)

INFO [2022-11-01 15:43:18] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2022-11-01 15:43:18] subtracting mean(normal) per gene per cell across all data
INFO [2022-11-01 15:43:29] -subtracting expr per gene, use_bounds=TRUE
INFO [2022-11-01 15:43:44] -mirroring for hspike
INFO [2022-11-01 15:43:44] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2022-11-01 15:43:44] subtracting mean(normal) per gene per cell across all data
INFO [2022-11-01 15:43:46] -subtracting expr per gene, use_bounds=TRUE
INFO [2022-11-01 15:45:16]

        STEP 14: invert log2(FC) to FC

INFO [2022-11-01 15:45:16] invert_log2(), computing 2^x
INFO [2022-11-01 15:45:27] -mirroring for hspike
INFO [2022-11-01 15:45:27] invert_log2(), computing 2^x
INFO [2022-11-01 15:47:30]

        STEP 15: computing tumor subclusters via leiden

INFO [2022-11-01 15:47:30] define_signif_tumor_subclusters(p_val=0.25
INFO [2022-11-01 15:47:30] define_signif_tumor_subclusters(), tumor: normal
INFO [2022-11-01 16:56:57] define_signif_tumor_subclusters(), tumor: tumor
INFO [2022-11-01 18:14:35] -mirroring for hspike
INFO [2022-11-01 18:14:35] define_signif_tumor_subclusters(p_val=0.25
INFO [2022-11-01 18:14:35] define_signif_tumor_subclusters(), tumor: spike_tumor_cell_normal
INFO [2022-11-01 18:14:35] define_signif_tumor_subclusters(), tumor: spike_tumor_cell_tumor
INFO [2022-11-01 18:14:36] define_signif_tumor_subclusters(), tumor: simnorm_cell_normal
INFO [2022-11-01 18:14:36] define_signif_tumor_subclusters(), tumor: simnorm_cell_tumor
INFO [2022-11-01 18:18:43] ::plot_cnv:Start
INFO [2022-11-01 18:18:43] ::plot_cnv:Current data dimensions (r,c)=5182,31885 Total=165420330.51334 Min=0.694857310986877 Max=7.04179348625365.
INFO [2022-11-01 18:18:45] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2022-11-01 18:22:34] plot_cnv(): auto thresholding at: (0.873265 , 1.129062)
INFO [2022-11-01 18:22:43] plot_cnv_observation:Start
INFO [2022-11-01 18:22:43] Observation data size: Cells= 0 Genes= 5182
Error in obs_dendrogram[[1]] : subscript out of bounds
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Any ideas, please?

GeorgescuC commented 1 year ago

Hi @jgarces02 ,

In your CreateInfercnvObject() call, you set ref_group_names = c("normal", "tumor"), so both your normal and tumor cells are defined as references. According to the table of identities you posted, that leaves no group of cells as observations. The setting should simply be ref_group_names = c("normal").
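
A quick pre-flight check can catch this before a multi-hour run. The following is only a minimal base-R sketch: `observation_groups()` is a hypothetical helper, not part of infercnv, and the toy labels simply mirror the cell counts posted above (13770 normal, 18115 tumor).

```r
# Hypothetical helper (not an infercnv function): return the annotation groups
# that would remain as observations once the reference groups are set aside.
observation_groups <- function(group_labels, ref_group_names) {
  counts <- table(group_labels)
  obs <- counts[setdiff(names(counts), ref_group_names)]
  if (length(obs) == 0 || sum(obs) == 0) {
    stop("No observation cells left: every annotation group is listed in ref_group_names.")
  }
  obs
}

# Toy labels mirroring the counts from the thread (assumption: one label per cell).
labels <- rep(c("normal", "tumor"), times = c(13770, 18115))

# Only the normal cells are references, so the tumor cells remain as observations.
obs_ok <- observation_groups(labels, ref_group_names = c("normal"))
print(obs_ok)

# Both groups are references, so nothing is left to compare and the check stops.
obs_bad <- tryCatch(
  observation_groups(labels, ref_group_names = c("normal", "tumor")),
  error = function(e) conditionMessage(e)
)
print(obs_bad)
```

In practice the labels would come from the group column of the annotations file (assumed here to be its second column) rather than from a toy vector, and the check would run just before CreateInfercnvObject().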

Regards, Christophe.

jgarces02 commented 1 year ago

Yep, I realized. Thanks!