broadinstitute / infercnv

Inferring CNV from Single-Cell RNA-Seq

Error in step 15 #545

Open kangjiajinlong opened 1 year ago

kangjiajinlong commented 1 year ago

Hi infercnv team,

I am running infercnv with the following command:

options(scipen = 100)
infercnv_obj <- infercnv::CreateInfercnvObject(raw_counts_matrix="infercnv/input_raw_count_deconv/expr_inter.txt",
                                               annotations_file="infercnv/input_raw_count_deconv/anno_inter.txt",
                                               delim="\t",
                                               gene_order_file="infercnv/input_raw_count_deconv/gene_pos.txt",
                                               ref_group_names="ref")
infercnv_obj <- infercnv::run(infercnv_obj,
                              cutoff=0.1,
                              out_dir="infercnv/output_raw_count",
                              cluster_by_groups=FALSE,
                              analysis_mode="subclusters",
                              #tumor_subcluster_pval=0.001,
                              leiden_resolution=0.01,
                              k_obs_groups=30,
                              denoise=T,
                              HMM=T,
                              HMM_type="i6",
                              num_threads=6)

However, I am stuck at step 15 and got the following error:

INFO [2023-05-22 16:24:09] Parsing matrix: infercnv/input_raw_count_deconv/expr_inter.txt
INFO [2023-05-22 16:24:38] Parsing gene order file: infercnv/input_raw_count_deconv/gene_pos.txt
INFO [2023-05-22 16:24:38] Parsing cell annotations file: infercnv/input_raw_count_deconv/anno_inter.txt
INFO [2023-05-22 16:24:38] ::order_reduce:Start.
INFO [2023-05-22 16:24:38] .order_reduce(): expr and order match.
INFO [2023-05-22 16:24:39] ::process_data:order_reduce:Reduction from positional data, new dimensions (r,c) = 14801,6861 Total=136279155 Min=0 Max=2653.
INFO [2023-05-22 16:24:39] -filtering out cells < 100 or > Inf, removing 0.0874508 % of cells
INFO [2023-05-22 16:24:40] validating infercnv_obj
INFO [2023-05-22 16:24:41] ::process_data:Start
INFO [2023-05-22 16:24:41] Checking for saved results.
INFO [2023-05-22 16:24:41]

STEP 1: incoming data

INFO [2023-05-22 16:25:03]

STEP 02: Removing lowly expressed genes

INFO [2023-05-22 16:25:03] ::above_min_mean_expr_cutoff:Start
INFO [2023-05-22 16:25:03] Removing 3255 genes from matrix as below mean expr threshold: 0.1
INFO [2023-05-22 16:25:04] validating infercnv_obj
INFO [2023-05-22 16:25:04] There are 11546 genes and 6855 cells remaining in the expr matrix.
INFO [2023-05-22 16:25:06] no genes removed due to min cells/gene filter
INFO [2023-05-22 16:25:26]

STEP 03: normalization by sequencing depth

INFO [2023-05-22 16:25:26] normalizing counts matrix by depth
INFO [2023-05-22 16:25:27] Computed total sum normalization factor as median libsize: 18506.000000
INFO [2023-05-22 16:25:28] Adding h-spike
INFO [2023-05-22 16:25:28] -hspike modeling of ref
INFO [2023-05-22 16:26:39] validating infercnv_obj
INFO [2023-05-22 16:26:39] normalizing counts matrix by depth
INFO [2023-05-22 16:26:39] Using specified normalization factor: 18506.000000
INFO [2023-05-22 16:27:02]

STEP 04: log transformation of data

INFO [2023-05-22 16:27:02] transforming log2xplus1()
INFO [2023-05-22 16:27:04] -mirroring for hspike
INFO [2023-05-22 16:27:04] transforming log2xplus1()
INFO [2023-05-22 16:27:26]

STEP 08: removing average of reference data (before smoothing)

INFO [2023-05-22 16:27:26] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2023-05-22 16:27:26] subtracting mean(normal) per gene per cell across all data
INFO [2023-05-22 16:27:30] -subtracting expr per gene, use_bounds=TRUE
INFO [2023-05-22 16:27:36] -mirroring for hspike
INFO [2023-05-22 16:27:36] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2023-05-22 16:27:36] subtracting mean(normal) per gene per cell across all data
INFO [2023-05-22 16:27:41] -subtracting expr per gene, use_bounds=TRUE
INFO [2023-05-22 16:28:20]

STEP 09: apply max centered expression threshold: 3

INFO [2023-05-22 16:28:20] ::process_data:setting max centered expr, threshold set to: +/-: 3
INFO [2023-05-22 16:28:21] -mirroring for hspike
INFO [2023-05-22 16:28:21] ::process_data:setting max centered expr, threshold set to: +/-: 3
INFO [2023-05-22 16:29:02]

STEP 10: Smoothing data per cell by chromosome

INFO [2023-05-22 16:29:02] smooth_by_chromosome: chr: 1
INFO [2023-05-22 16:29:08] smooth_by_chromosome: chr: 2
INFO [2023-05-22 16:29:13] smooth_by_chromosome: chr: 3
INFO [2023-05-22 16:29:18] smooth_by_chromosome: chr: 4
INFO [2023-05-22 16:29:23] smooth_by_chromosome: chr: 5
INFO [2023-05-22 16:29:27] smooth_by_chromosome: chr: 6
INFO [2023-05-22 16:29:31] smooth_by_chromosome: chr: 7
INFO [2023-05-22 16:29:36] smooth_by_chromosome: chr: 8
INFO [2023-05-22 16:29:39] smooth_by_chromosome: chr: 9
INFO [2023-05-22 16:29:44] smooth_by_chromosome: chr: 10
INFO [2023-05-22 16:29:48] smooth_by_chromosome: chr: 11
INFO [2023-05-22 16:29:53] smooth_by_chromosome: chr: 12
INFO [2023-05-22 16:29:57] smooth_by_chromosome: chr: 13
INFO [2023-05-22 16:30:01] smooth_by_chromosome: chr: 14
INFO [2023-05-22 16:30:05] smooth_by_chromosome: chr: 15
INFO [2023-05-22 16:30:09] smooth_by_chromosome: chr: 16
INFO [2023-05-22 16:30:13] smooth_by_chromosome: chr: 17
INFO [2023-05-22 16:30:18] smooth_by_chromosome: chr: 18
INFO [2023-05-22 16:30:21] smooth_by_chromosome: chr: 19
INFO [2023-05-22 16:30:26] smooth_by_chromosome: chr: 20
INFO [2023-05-22 16:30:30] smooth_by_chromosome: chr: 21
INFO [2023-05-22 16:30:33] smooth_by_chromosome: chr: 22
INFO [2023-05-22 16:30:36] smooth_by_chromosome: chr: 23
INFO [2023-05-22 16:30:40] -mirroring for hspike
INFO [2023-05-22 16:30:40] smooth_by_chromosome: chr: chr_A
INFO [2023-05-22 16:30:40] smooth_by_chromosome: chr: chr_0
INFO [2023-05-22 16:30:41] smooth_by_chromosome: chr: chr_B
INFO [2023-05-22 16:30:41] smooth_by_chromosome: chr: chr_0pt5
INFO [2023-05-22 16:30:41] smooth_by_chromosome: chr: chr_C
INFO [2023-05-22 16:30:41] smooth_by_chromosome: chr: chr_1pt5
INFO [2023-05-22 16:30:41] smooth_by_chromosome: chr: chr_D
INFO [2023-05-22 16:30:41] smooth_by_chromosome: chr: chr_2pt0
INFO [2023-05-22 16:30:42] smooth_by_chromosome: chr: chr_E
INFO [2023-05-22 16:30:42] smooth_by_chromosome: chr: chr_3pt0
INFO [2023-05-22 16:30:42] smooth_by_chromosome: chr: chr_F
INFO [2023-05-22 16:31:23]

STEP 11: re-centering data across chromosome after smoothing

INFO [2023-05-22 16:31:23] ::center_smooth across chromosomes per cell
INFO [2023-05-22 16:31:39] -mirroring for hspike
INFO [2023-05-22 16:31:39] ::center_smooth across chromosomes per cell
INFO [2023-05-22 16:32:22]

STEP 12: removing average of reference data (after smoothing)

INFO [2023-05-22 16:32:22] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2023-05-22 16:32:22] subtracting mean(normal) per gene per cell across all data
INFO [2023-05-22 16:32:26] -subtracting expr per gene, use_bounds=TRUE
INFO [2023-05-22 16:32:34] -mirroring for hspike
INFO [2023-05-22 16:32:34] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE
INFO [2023-05-22 16:32:34] subtracting mean(normal) per gene per cell across all data
INFO [2023-05-22 16:32:39] -subtracting expr per gene, use_bounds=TRUE
INFO [2023-05-22 16:33:20]

STEP 14: invert log2(FC) to FC

INFO [2023-05-22 16:33:20] invert_log2(), computing 2^x
INFO [2023-05-22 16:33:27] -mirroring for hspike
INFO [2023-05-22 16:33:27] invert_log2(), computing 2^x
INFO [2023-05-22 16:34:11]

STEP 15: computing tumor subclusters via leiden

INFO [2023-05-22 16:34:11] define_signif_tumor_subclusters(p_val=0.1
INFO [2023-05-22 16:34:12] define_signif_tumor_subclusters(), tumor: all_observations
Counts matrix provided is not sparse. Creating V5 assay in Seurat Object.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**|
Warning: No layers found matching search pattern provided
Centering and scaling data matrix
|======================================================================| 100%
PC 1
Positive: RBFOX3, ENGASE, CBX8, CBX4, TBC1D16, C1QTNF1, CCDC40, CANT1, GAA, LGALS3BP
EIF4A3, SGSH, TIMP2, SLC26A11, USP36, CYTH1, RNF213, PGS1, ENDOV, SOCS3
NPTX1, BIRC5, RPTOR, TK1, CHMP6, SYNGR2, TMC8, TMC6, TNRC6C, SEC14L1
Negative: OCIAD2, DCUN1D4, OCIAD1, LRRC66, FRYL, SLC10A4, SGCB, SLAIN2, USP46, NFXL1
SCFD2, ATP10D, FIP1L1, COMMD8, GABRB1, LNX1, GABRA4, CHIC2, GABRA2, PDGFRA
KIT, GABRG1, GNPDA2, GUF1, ATP8A1, SLC30A9, OSGIN2, RIPK2, NBN, MMP16
PC 2
Positive: DCLRE1C, ACBD7, SUV39H2, HSPA14, RPP38, FAM107B, NMT2, FRMD4A, FAM171A1, PRPF18
PTER, BEND7, RSU1, SEPHS1, CUBN, PHYH, TRDMT1, OPTN, VIM, ST8SIA6
CCDC3, STAM, CAMK1D, MRC1, CDC123, SH3BP5, METTL6, SLC39A12, CAPN7, EAF1
Negative: PLOD3, VGF, BCL2L1, TPX2, ID1, DUSP15, AP1S1, TTLL9, HM13, PDRG1
ZNF337, CCM2L, SERPINE1, HCK, TM9SF4, NANP, TRIM56, PLAGL2, POFUT1, NINL
ACHE, KIF3B, ASXL1, SRRT, NOL4L, TRIP6, SLC12A9, SP4, ITGB8, TWISTNB
PC 3
Positive: STIM1, ZNF549, ZNF550, ZNF416, ZNF772, ZIK1, RRM1, CAMSAP1, ZNF134, VN1R1
ZNF211, KCNT1, ZNF671, ZNF776, TRIM21, ZNF586, C9orf116, TRIM68, ZNF552, ZNF587B
PPP1R26, ZNF587, HBB, ZNF256, OLFM1, C19orf18, ZNF606, ZNF135, COL5A1, RXRA
Negative: BCAS3, PPM1D, TBX2, INTS2, APPBP2, MED13, METTL2A, USP32, TLK2, CA4
MRC2, TANC2, HEATR6, CYB561, ACE, RNFT1, DCAF7, TUBD1, TACO1, VMP1
MAP3K3, PTRH2, LIMD2, CLTC, STRADA, DHX40, CCDC47, YPEL2, DDX42, GDPD1
PC 4
Positive: COL5A2, COL3A1, WDR75, SLC40A1, ASNSD1, GULP1, ANKAR, OSGEPL1, TFPI, ORMDL1
PMS1, CALCRL, C2orf88, HIBCH, INPP1, FAM171B, MFSD6, ITGAV, NEMP2, ZC3H15
NAB1, DUSP19, GLS, STAT1, NCKAP1, STAT4, FRZB, MYO1B, DNAJC10, NABP1
Negative: B9D1, EPN2, MAPK7, PRPSAP2, MFAP4, RNF112, SHMT1, SMCR8, SLC47A1, TOP3A
MIEF2, FLII, LLGL1, ALDH3A2, DRG2, ALKBH5, GID4, ATPAF2, ULK2, TOM1L2
SREBF1, AKAP10, RAI1, PEMT, SPECC1, RASD1, USP22, MED9, DHRS7B, NT5M
PC 5
Positive: EIF1, TMEM99, HAP1, KRT10, JUP, KRT222, P3H4, SMARCE1, FKBP10, IGFBP4
NT5C3B, TOP2A, KLHL11, RARA, ACLY, CDC6, TTC25, WIPF2, CNP, RAPGEFL1
DNAJC7, CASC3, NKIRAS2, MSL1, DHX58, NR1D1, KAT2A, THRA, RAB5C, MED24
Negative: HEATR6, RNFT1, CA4, TUBD1, USP32, VMP1, APPBP2, PPM1D, PTRH2, BCAS3
CLTC, TBX2, DHX40, INTS2, MED13, YPEL2, METTL2A, TLK2, GDPD1, MRC2
TANC2, SMG8, CYB561, ACE, PRR11, DCAF7, TACO1, MAP3K3, LIMD2, STRADA
Computing nearest neighbor graph
Computing SNN
INFO [2023-05-22 16:34:28] define_signif_tumor_subclusters(), tumor: ref
Counts matrix provided is not sparse. Creating V5 assay in Seurat Object.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**|
Warning: No layers found matching search pattern provided
Centering and scaling data matrix
|======================================================================| 100%
PC 1
Positive: MGRN1, NUDT16L1, HMOX2, UBALD1, ANKS3, ZNF500, ROGDI, NMRAL1, GLYR1, VASN
UBN1, NAGPA, CORO7, ALG1, PAM16, EEF2KMT, ADCY9, RBFOX1, CREBBP, METTL22
TRAP1, ABAT, DNASE1, TMEM186, SLX4, PMM2, CLUAP1, CARHSP1, NAA60, USP7
Negative: HEATR6, RNFT1, CA4, TUBD1, GDPD1, SMG8, USP32, YPEL2, VMP1, PRR11
DHX40, PTRH2, APPBP2, CLTC, TRIM37, PPM1E, PPM1D, RAD51C, CUEDC1, VEZF1
MTMR4, MSI2, SUPT4H1, AKAP1, BCAS3, TSPOAP1, SRSF1, MKS1, SCPEP1, DYNLL2
PC 2
Positive: SCPEP1, COIL, AKAP1, TRIM25, MSI2, DGKE, CUEDC1, NOG, VEZF1, SRSF1
ANKFN1, PCTP, DYNLL2, MKS1, TSPOAP1, SUPT4H1, MTMR4, CA10, UTP18, RAD51C
MBTD1, PPM1E, NME2, TRIM37, NME1, PRR11, SPAG9, SMG8, TOB1, LUC7L3
Negative: VWCE, SLC15A3, DDB1, TKFC, TMEM132A, CYB561A3, TMEM109, PRPF19, TMEM138, CCDC86
TMEM216, MS4A14, CPSF7, MS4A7, SYT7, DAGLA, MYRF, MS4A4A, TMEM258, FEN1
FADS2, MS4A4E, FADS1, MS4A6A, FADS3, RAB3IL1, STX3, BEST1, FTH1, PATL1
PC 3
Positive: GOLT1B, SPX, RECQL, PYROXD1, KCNJ8, LDHB, ABCC9, CMAS, ST8SIA1, C2CD5
ETNK1, SOX5, BCAT1, LRMP, CASC1, KRAS, RASSF8, BHLHE41, SSPN, ITPR2
POT1, ZNF800, FGFR1OP2, GPR37, GCC1, TMEM229A, ARF5, SND1, WASL, LRRC4
Negative: VWCE, DDB1, SLC15A3, TMEM132A, TKFC, TMEM109, CYB561A3, PRPF19, TMEM138, CCDC86
TMEM216, MS4A14, CPSF7, MS4A7, SYT7, MS4A4A, DAGLA, MS4A4E, MYRF, MS4A6A
TMEM258, FEN1, STX3, FADS2, PATL1, FADS1, FADS3, OSBP, RAB3IL1, MPEG1
PC 4
Positive: DOCK3, MANF, MAPKAPK3, RBM15B, HEMK1, DCAF1, TMEM115, RAD54L2, CYB561D2, TEX264
NPRL2, RRP9, RASSF1, PARP3, PCBP4, TUSC2, ABHD14B, HYAL2, ABHD14A, IFRD2
ACY1, SEMA3B, DUSP7, GNAI2, ALAS1, SLC38A3, TWF2, SEMA3F, PPM1M, RBM5
Negative: TMEM216, CPSF7, TMEM138, CYB561A3, SYT7, TKFC, DDB1, DAGLA, VWCE, MYRF
TMEM258, SLC15A3, TMEM132A, FEN1, TMEM109, FADS2, PRPF19, FADS1, CCDC86, FADS3
RAB3IL1, MS4A14, BEST1, MS4A7, FTH1, MS4A4A, INCENP, MS4A4E, ASRGL1, MS4A6A
PC 5
Positive: AHCYL2, SMO, SLC35B4, TSPAN33, AKR1B1, TNPO3, BPGM, CALD1, IRF5, TMEM140
ATP6V1F, WDR91, FLNC, CNOT4, CCDC136, NUP205, CALU, MTPN, FAM71F2, PTN
DGKI, HILPDA, CREB3L2, RBM28, TRIM24, KIAA1549, LRRC4, ZC3HAV1, SND1, TTC26
Negative: AMPD2, GSTM3, CSF1, GNAI3, AHCYL1, AMIGO1, STRIP1, CYB561D1, SLC6A17, PSMA5
SORT1, KCNC4, PSRC1, RBM15, CELSR2, SARS, SLC16A4, KIAA1324, LAMTOR5, TMEM167B
TAF13, KCNA2, WDR47, CD53, CLCC1, LRIF1, GPSM2, STXBP3, DRAM2, PRPF38B
Computing nearest neighbor graph
Computing SNN
INFO [2023-05-22 16:34:32] -mirroring for hspike
INFO [2023-05-22 16:34:32] define_signif_tumor_subclusters(p_val=0.1
INFO [2023-05-22 16:34:32] define_signif_tumor_subclusters(), tumor: spike_tumor_cell_ref
INFO [2023-05-22 16:34:32] cut tree into: 1 groups
INFO [2023-05-22 16:34:32] -processing spike_tumor_cell_ref,spike_tumor_cell_ref_s1
INFO [2023-05-22 16:34:32] define_signif_tumor_subclusters(), tumor: simnorm_cell_ref
INFO [2023-05-22 16:34:32] cut tree into: 1 groups
INFO [2023-05-22 16:34:32] -processing simnorm_cell_ref,simnorm_cell_ref_s1
INFO [2023-05-22 16:35:17] ::plot_cnv:Start
INFO [2023-05-22 16:35:17] ::plot_cnv:Current data dimensions (r,c)=11546,6855 Total=79283899.5846625 Min=0.60213105797507 Max=1.81635580615788.
INFO [2023-05-22 16:35:17] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2023-05-22 16:35:18] plot_cnv(): auto thresholding at: (0.731361 , 1.272077)
INFO [2023-05-22 16:35:20] plot_cnv_observation:Start
INFO [2023-05-22 16:35:20] Observation data size: Cells= 4742 Genes= 11546
INFO [2023-05-22 16:35:20] clustering observations via method: ward.D
Error in seq_len(max(obs_annotations_groups)) : argument must be coercible to non-negative integer
Calls: sourceWithProgress ... plot_subclusters -> plot_cnv -> .plot_cnv_observations
In addition: Warning message:
In max(nchar(obs_annotations_names)) : no non-missing arguments to max; returning -Inf
Execution halted

I am very puzzled about the error message. Is there something wrong with my command?

Thanks,
Jack
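For context, the final error message in the log can be reproduced in base R alone. This is only a minimal sketch under an assumption: that the observation annotations were empty when the plot was drawn, which the accompanying warning about `max(nchar(obs_annotations_names))` suggests. The variable below borrows its name from the traceback for readability; it is not infercnv's actual code.

```r
# Assumption: an empty annotation vector, standing in for the one named in the traceback.
obs_annotations_groups <- integer(0)

# max() of an empty vector warns and returns -Inf.
group_max <- suppressWarnings(max(obs_annotations_groups))

# seq_len() only accepts a non-negative integer length, so -Inf raises an error.
err_msg <- tryCatch(
  {
    seq_len(group_max)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

print(group_max)
print(err_msg)
```

If this is what happened, the command itself is not at fault; the plotting code received no usable group annotations.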

GeorgescuC commented 1 year ago

Hi @kangjiajinlong ,

Are you on version 1.15.2 of infercnv? There was a bug prior to version 1.15.3 that affected the plot_subclusters() method (where your error happens) when using cluster_by_groups=FALSE, so if that is the case, simply updating to the new version 1.16.0 should solve the issue.
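To see whether the installed version predates the fix, a check along these lines may help. It is a sketch rather than an official procedure: `needs_update` is a hypothetical helper (not part of infercnv), and the suggested `BiocManager::install("infercnv")` call assumes the package was installed from Bioconductor.

```r
# Hypothetical helper: TRUE if a version string is older than the release with the fix.
needs_update <- function(installed, fixed = "1.15.3") {
  package_version(installed) < package_version(fixed)
}

if (requireNamespace("infercnv", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("infercnv"))
  message("Installed infercnv version: ", installed)
  if (needs_update(installed)) {
    # Assumption: the package comes from Bioconductor.
    message("Older than 1.15.3; consider BiocManager::install(\"infercnv\")")
  }
} else {
  message("infercnv is not installed in this R library")
}
```

After updating, restart the R session before re-running so the new version is the one that gets loaded.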

Regards, Christophe.