broadinstitute / infercnv

Inferring CNV from Single-Cell RNA-Seq

A minor issue in CreateInfercnvObject function #455

Open vwtlin opened 2 years ago

vwtlin commented 2 years ago

Thanks for this powerful tool, which I have been using.

Recent versions of 10x Cell Ranger/Space Ranger generate cell or spot barcodes that end with a hyphen and a number, for example "GTTATATCAGGAGCCA-1". With barcodes in this format, CreateInfercnvObject fails to create the object. Once the hyphen and the number are removed from the count matrix and the annotation file, it works.

It is a small issue, but it requires an extra step, and the renamed cells may also make it troublesome to add the infercnv results back to the Seurat object directly. It would be great if this could be fixed.
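The workaround described above can be sketched in base R. This is only an illustration: `strip_suffix` and `lookup` are hypothetical names, not part of infercnv or Seurat, and the sketch assumes that every barcode ends in a hyphen followed by digits and that the stripped barcodes remain unique (which may not hold when several samples are merged).

```r
# Hypothetical helper: drop a trailing "-<digits>" suffix from 10x barcodes.
strip_suffix <- function(barcodes) {
  sub("-[0-9]+$", "", barcodes)
}

# Example barcodes in the newer Cell Ranger / Space Ranger format.
original <- c("GTTATATCAGGAGCCA-1", "AAACAAGTATCTCCCA-1")
stripped <- strip_suffix(original)

# Stripping must not create duplicates, otherwise cells can no longer be told apart.
stopifnot(!anyDuplicated(stripped))

# Named lookup (stripped -> original) so that results keyed by the stripped
# names can be mapped back to the names used in the Seurat object.
lookup <- setNames(original, stripped)
restored <- unname(lookup[stripped])
print(restored)
```

Keeping the lookup vector around is one way to avoid the extra trouble when adding results back to the Seurat object, since the original cell names can be recovered by indexing.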

Cheers, Weitao

GeorgescuC commented 2 years ago

Hi @vwtlin ,

I have tried modifying the example to include some cell names that have a dash followed by a number but that does not seem to break things. Could you provide a minimal example where the error happens?

An alternative in cases where infercnv cannot parse your inputs from files properly is to read the files yourself in R and provide the R objects to CreateInfercnvObject() directly.
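A minimal sketch of that alternative, assuming tab-delimited files like the ones used in this thread. The toy files, gene names, and counts are made up so that the example runs on its own. The CreateInfercnvObject() call is left as a comment because it needs infercnv and a gene order file; the layout assumed for the annotations data frame (cell names as row names, group in the first column) should be checked against the function's documentation.

```r
# Toy inputs standing in for the real files (hypothetical names and values).
counts_path <- tempfile(fileext = ".matrix")
annot_path <- tempfile(fileext = ".txt")
toy <- matrix(1:4, nrow = 2,
              dimnames = list(c("GeneA", "GeneB"),
                              c("AAACAAGTATCTCCCA-1", "AAACAGAGCGACTCCT-1")))
write.table(toy, counts_path, quote = FALSE, sep = "\t")
write.table(data.frame(colnames(toy), c("cluster_0", "cluster_1")),
            annot_path, quote = FALSE, sep = "\t",
            col.names = FALSE, row.names = FALSE)

# Read the counts yourself; check.names = FALSE keeps "-1" intact instead of
# letting R rewrite it to ".1".
counts <- as.matrix(read.table(counts_path, header = TRUE, row.names = 1,
                               sep = "\t", check.names = FALSE))

# Read the annotations with the cell names as row names.
annotations <- read.table(annot_path, header = FALSE, row.names = 1,
                          sep = "\t", stringsAsFactors = FALSE)

# Every annotated cell should now be a column of the count matrix.
stopifnot(all(rownames(annotations) %in% colnames(counts)))

# With infercnv installed, the objects could then be passed directly, e.g.:
# infercnv_obj <- infercnv::CreateInfercnvObject(
#   raw_counts_matrix = counts,
#   annotations_file  = annotations,
#   gene_order_file   = "~/Desktop/Temp/gencode_v19_gene_pos.txt",
#   ref_group_names   = c("cluster_0"))
```

Reading the files by hand makes it possible to inspect and compare the cell names before infercnv sees them.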

Regards, Christophe.

vwtlin commented 1 year ago

Hi @GeorgescuC , sorry I missed your comment quite a while ago! Apologies in advance if I do not format my comment properly with GitHub's features.

Briefly, the spaceranger version is 1.3.1. I first use the spaceranger output in the Seurat spatial workflow to perform dimensionality reduction and clustering. The code and output used to get the count matrix and annotations for infercnv are shown below.

# export the count matrix
counts_matrix <- GetAssayData(visium, slot="counts")
head(colnames(counts_matrix))
[1] "AAACAAGTATCTCCCA-1" "AAACAGAGCGACTCCT-1" "AAACATTTCCCGGATT-1" "AAACCCGAACGAAATC-1" "AAACCGGGTAGGTACC-1" "AAACCGTTCGTCCAGG-1"
write.table(round(counts_matrix, digits=3), file='~/Desktop/Temp/1st/st.10x.counts.matrix', quote=F, sep="\t")

# export annotations
visium@meta.data$cellID <- rownames(visium@meta.data)
visium@meta.data$clusters <- paste0("cluster_", visium@meta.data$SCT_snn_res.0.2)
write.table(visium@meta.data[, c("cellID", "clusters")], "~/Desktop/Temp/1st/cellAnnotations.txt", quote=F, sep="\t", col.names = F, row.names=F)

# create infercnv object
infercnv_obj = CreateInfercnvObject(raw_counts_matrix = "~/Desktop/Temp/1st/st.10x.counts.matrix",
                                    annotations_file = "~/Desktop/Temp/1st/cellAnnotations.txt",
                                    delim = "\t",
                                    gene_order_file = "~/Desktop/Temp/gencode_v19_gene_pos.txt",
                                    ref_group_names = c("cluster_0"))
INFO [2022-12-20 23:50:33] Parsing matrix: ~/Desktop/Temp/1st/st.10x.counts.matrix
INFO [2022-12-20 23:50:44] Parsing gene order file: ~/Desktop/Temp/gencode_v19_gene_pos.txt
INFO [2022-12-20 23:50:44] Parsing cell annotations file: ~/Desktop/Temp/1st/cellAnnotations.txt

Error in CreateInfercnvObject(raw_counts_matrix = "~/Desktop/Temp/1st/st.10x.counts.matrix", : Please make sure that all the annotated cell names match a sample in your data matrix. Attention to: AAACAAGTATCTCCCA-1,AAACAGAGCGACTCCT-1,AAACATTTCCCGGATT-1,AAACCCGAACGAAATC-1,AAACCGGGTAGGTACC-1,AAACCGTTCGTCCAGG-1,AAACCTCATGAAGTTG-1,AAACGAAGAACATACC-1,AAACGAGACGGTTGAT-1,AAACGCCCGAGATCGG-1,AAACTAACGTGGCGAC-1,AAACTCGTGATATAAG-1,AAACTGCTGGCTCCAA-1,AAAGACCCAAGTCGCG-1,AAAGACTGGGCGCTTT-1,AAAGGCTACGGACCAT-1,AAAGGCTCTCGCGCCG-1,AAAGGGATGTAGCAAG-1,AAAGGGCAGCTTGAAT-1,AAAGGTAAGCTGTACC-1,AAAGTCACTGATGTAA-1,AAAGTGTGATTTATCT-1,AAAGTTGACTCCCGTA-1,AAATAACCATACGGGA-1,AAATAAGGTAGTGCCC-1,AAATACCTATAAGCAT-1,AAATAGCTTAGACTTT-1,AAATAGGGTGCTATTG-1,AAATCGTGTACCACAA-1,AAATCTAGCCCTGCTA-1,AAATGATTCGATCAGC-1,AAATGCTCGTTACGTT-1,AAATGGCATGTCTTGT-1,AAATGGTCAATGTGCC-1,AAATTAACGGGTAGCT-1,AAATTACACGACTCTG-1,AAATTACCTATCGATG-1,AAATTCCAGGTCCAAA-1,AAATTGATAGTCCTTT-1,AAATTTACCGAAATCC-1,AAATTTGCGGGTGTGG-1,AACAACTGGTAGTTGC-1,AACAATACATTGTCGA-1,AACAATTACTCTACGC-1,AACACACGCTCGCCGC-1,AACACGACTGTACTGA-1,AACACGCGGCCGC

If I use gsub to remove the -1 from the colnames of the count matrix and the rownames of the meta.data before exporting the count matrix and the annotation file, the error in CreateInfercnvObject goes away. I am not sure whether I missed anything in the code above.
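One possible explanation, not confirmed anywhere in this thread, is that the hyphen is rewritten somewhere between export and import. Base R's `make.names()` replaces characters that are not valid in syntactic names with a dot, and `data.frame()` and `read.table()` apply it to column names by default through `check.names = TRUE`. The sketch below only demonstrates that base R behavior on one of the barcodes; the toy matrix is made up.

```r
barcode <- "AAACAAGTATCTCCCA-1"

# make.names() replaces the hyphen with a dot.
mangled <- make.names(barcode)
print(mangled)

# data.frame() applies the same conversion to column names by default ...
m <- matrix(0, nrow = 1, dimnames = list("GeneA", barcode))
default_names <- colnames(data.frame(m))

# ... unless check.names = FALSE is given.
kept_names <- colnames(data.frame(m, check.names = FALSE))

print(default_names)
print(kept_names)
```

If something like this happened to only one of the two files, the names in the annotation file would no longer match the matrix columns, and stripping the suffix from both would hide the difference.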

Cheers, Weitao

GeorgescuC commented 1 year ago

Hi @vwtlin ,

As I am still unable to reproduce the issue, I would check, by printing the start of the annotation file you generated, that the "-1" suffixes are actually present in the cell names there too.
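A check along these lines could look like the following base R sketch. The two toy files are hypothetical and deliberately disagree, so that the output shows what a mismatch looks like; with real data, `counts_path` and `annot_path` would point at the exported matrix and annotation files.

```r
# Toy files (hypothetical content) standing in for the exported files.
counts_path <- tempfile()
annot_path <- tempfile()
writeLines(c("AAACAAGTATCTCCCA.1\tAAACAGAGCGACTCCT.1",
             "GeneA\t1\t3"), counts_path)
writeLines(c("AAACAAGTATCTCCCA-1\tcluster_0",
             "AAACAGAGCGACTCCT-1\tcluster_1"), annot_path)

# Cell names as they appear in the header line of the matrix file.
matrix_cells <- strsplit(readLines(counts_path, n = 1), "\t", fixed = TRUE)[[1]]

# Cell names as they appear in the first column of the annotation file.
annot_cells <- read.table(annot_path, header = FALSE, sep = "\t",
                          stringsAsFactors = FALSE)[[1]]

print(head(matrix_cells))
print(head(annot_cells))

# Annotated cells with no matching column in the matrix.
unmatched <- setdiff(annot_cells, matrix_cells)
print(unmatched)
```

Reading the header with `readLines()` rather than `read.table()` shows the names exactly as written to disk, without any name conversion by R.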

Otherwise, which version of infercnv are you using?

Regards, Christophe.

vwtlin commented 1 year ago

Hi @GeorgescuC,

I am using infercnv_1.12.0.

Attached is a screenshot of the annotation file. [Image: Screenshot 2023-01-22 at 11 11 02 pm]

Cheers, Weitao