broadinstitute / infercnv

Inferring CNV from Single-Cell RNA-Seq

Error at Step 18 of infercnv::run when validating infercnv object #450

Closed zvittorio closed 2 years ago

zvittorio commented 2 years ago

Hello everyone,

I am trying to run inferCNV on a tumor sample of 13201 cells that is loaded as a Seurat object.

```r
library(Seurat)
library(infercnv)
library(tidyverse)
library(ggpubr)
library(Matrix)
library(stringr)
library(data.table)
library(readxl)
library(readr)
library(gridExtra)

seuratObj <- readRDS("SeuratObject_After_ScaterSeuratDGEAnnot.rds")
counts <- as.matrix(seuratObj@assays$RNA@counts, drop=F)

cell_annotation <- Idents(seuratObj)

infercnv_obj = CreateInfercnvObject(raw_counts_matrix=counts[,,drop=F],
                                    annotations_file=as.data.frame(cell_annotation),
                                    gene_order_file="projects/gencodev21_gene_pos.txt",
                                    ref_group_names=c("Fibroblasts"))

infercnv_obj = infercnv::run(infercnv_obj,
                             cutoff=0.1,
                             out_dir="infercnv_run5_gencodev21/",
                             cluster_by_groups=TRUE,
                             denoise=TRUE,
                             HMM=TRUE,
                             num_threads=18,
                             HMM_type = "i6",
                             output_format = "pdf",
                             resume_mode = T)
```

But I always get:

STEP 18: Run Bayesian Network Model on HMM predicted CNVs

INFO [2022-08-27 17:19:28] Initializing new MCM InferCNV Object.
INFO [2022-08-27 17:19:28] validating infercnv_obj
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 11875 did not have 2 elements

The only input file that has 2 elements on every line is the annotation file, so I checked line 11875, but it looks completely fine, and so do the neighboring lines. I also checked the same line of all the other inputs, directly from the infercnv object, to rule out the problem being introduced when the infercnv object was created. They all look fine. I also checked the source code of the function validate_infercnvobject, but all of its conditions are met.

What I also do not understand is why the error is raised so late in infercnv::run. If it is actually related to the annotation file missing an element, shouldn't the tool complain much earlier, for example directly when creating the infercnv object (I assume...)? Oh, I also included drop=F wherever possible, but that did not change anything.

The problem is that I am quite new to the tool and do not have a complete understanding of how HMMs work in this analysis, so I really do not know what the tool does at that step. Apologies.
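For anyone hitting the same message, here is a minimal sketch of how the "did not have 2 elements" check can be reproduced by hand on any tab-delimited file, using base R's `count.fields()`. The file and its contents below are hypothetical and only illustrate the idea; which file infercnv actually reads at step 18 is not clear from the log, so the path would need to be swapped for the real candidate file.

```r
# Hypothetical example file: a two-column, tab-delimited annotation table
# with one deliberately malformed line (line 3 has only one field).
annot_path <- tempfile(fileext = ".txt")
writeLines(c("cell_1\tFibroblasts",
             "cell_2\tTumor",
             "cell_3",
             "cell_4\tTumor"),
           annot_path)

# count.fields() returns the number of fields found on each line.
# Quoting and comment characters are disabled so that characters such as
# ' or # inside cell names are not interpreted specially.
n_fields <- count.fields(annot_path, sep = "\t", quote = "", comment.char = "")

# Line numbers whose field count differs from the expected 2.
bad_lines <- which(is.na(n_fields) | n_fields != 2)
bad_lines
```

Running the same check on each file in the output directory, not only on the original inputs, may help locate a file that was written by an earlier run.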

PS I am running R 4.2.0 with inferCNV 1.12.0, which has always worked in our lab.

I would be extremely grateful to anyone that can provide help!

Best

GeorgescuC commented 2 years ago

Hi @zvittorio ,

From the log alone, I am not sure either why this error would happen at this point. Problems with the input files should indeed raise errors earlier than that. Have you perhaps updated infercnv after running some of the steps? It might be worth a try to just rerun infercnv in a different folder to see if the issue is consistent. If the issue persists, would you be able to share your data privately?
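A rough sketch of what rerunning in a different folder could look like: create a new, empty output directory and point `out_dir` at it. The directory name is a made-up example, and the assumption here is that an empty folder leaves nothing from a previous run for `resume_mode` to pick up. The `infercnv::run` call is left commented out because it needs the real `infercnv_obj`.

```r
# Hypothetical fresh output directory; the naming scheme is only an example.
fresh_out_dir <- file.path(tempdir(),
                           format(Sys.time(), "infercnv_run_%Y%m%d_%H%M%S"))
dir.create(fresh_out_dir, recursive = TRUE)

# A newly created directory should contain no files from earlier runs.
leftover <- list.files(fresh_out_dir)
length(leftover)

# The run would then target the new folder (not executed here):
# infercnv_obj <- infercnv::run(infercnv_obj,
#                               cutoff = 0.1,
#                               out_dir = fresh_out_dir,
#                               cluster_by_groups = TRUE,
#                               denoise = TRUE,
#                               HMM = TRUE,
#                               HMM_type = "i6")
```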

Step 18 runs the Bayesian model, which you can find more details about on the wiki.

Regards, Christophe.

zvittorio commented 2 years ago

Thank you, Christophe. I actually managed to circumvent the error. I found out later in the analysis that the sample I was loading was actually a duplicate of another one. In my opinion, this created conflicts with the contents of the files that inferCNV was processing from the working directory. After submitting the correct cells to the tool, the error did not pop up again, so I guess I can close the issue.
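In case it helps someone else, a duplicated sample can be spotted before running the tool with a simple comparison of the count matrices. This is only a sketch on hypothetical toy matrices (`sample_a` and `sample_b` are invented names), not the check I actually ran:

```r
# Hypothetical toy count matrices (genes x cells); sample_b is an exact copy.
sample_a <- matrix(c(0, 3, 1, 5, 2, 0), nrow = 2,
                   dimnames = list(c("geneA", "geneB"),
                                   c("cell_1", "cell_2", "cell_3")))
sample_b <- sample_a

# Fraction of cell names in sample_a that also appear in sample_b.
shared_cells <- intersect(colnames(sample_a), colnames(sample_b))
frac_shared <- length(shared_cells) / ncol(sample_a)

# TRUE only if values, dimensions and dimnames all match exactly.
is_duplicate <- identical(sample_a, sample_b)

frac_shared
is_duplicate
```

A high fraction of shared cell names between two supposedly different samples would be a reason to double-check which file was loaded.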

Thank you again

Best

Vittorio