broadinstitute / infercnv

Inferring CNV from Single-Cell RNA-Seq

Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, : span is too small #474

Open yunbokai opened 1 year ago

yunbokai commented 1 year ago

Hi, I want to use inferCNV with my Visium data. The function is wrapped by https://github.com/aerickso/SpatialInferCNV/ . I have 3 samples, but 2 of them fail with the same error: "Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, : span is too small". This looks like an error in the underlying infercnv dependency rather than in the SpatialInferCNV functions.

Here is my output dir: [image]

Best wishes.

GeorgescuC commented 1 year ago

Hi @yunbokai ,

This is related to the new way we process the data when running the Leiden subclustering, in which we run a PCA first. From looking at the SpatialInferCNV example, it appears there are very few "cells", which is not enough for the PCA to be computed on. You should be able to get around this by setting leiden_method="simple" in the options of infercnv::run(). It may also be worth testing out how the leiden_function and leiden_resolution affect the quality of the subclustering if it is useful for your setup, or if it might be better to run the HMM in cells mode instead.

Regards, Christophe.
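
For readers following along, the suggestion above could be sketched roughly as follows. This is a minimal illustration, not a confirmed fix: the wrapper name `run_with_simple_leiden`, the `infercnv_obj` argument, and `out_dir` are placeholders, and the remaining argument values are assumptions that should be adapted to your own run.

```r
# Sketch only: wraps the suggested call so nothing runs until it is invoked.
# `infercnv_obj` and `out_dir` are placeholders for your own object and path.
run_with_simple_leiden <- function(infercnv_obj, out_dir) {
  infercnv::run(
    infercnv_obj,
    cutoff = 0.1,                   # assumed value; adjust for your data
    out_dir = out_dir,
    analysis_mode = "subclusters",
    leiden_method = "simple",       # the option suggested in the comment above
    denoise = TRUE,
    HMM = TRUE
  )
}
```

Calling `run_with_simple_leiden(my_obj, "./InferCNVrun_outputs")` would then start the run with the alternative Leiden method; `leiden_function` and `leiden_resolution` can be added to the same call to compare subclustering results.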

yunbokai commented 1 year ago

Hi Christophe, sorry for the late reply. Solving the error took me some time. I tried many options, including your advice. Setting leiden_method="simple" or changing leiden_function and leiden_resolution did not work for my data. However, with the code below I get a better result. To be honest, I'm not sure this code is suitable for 10X Visium data; maybe you can give me some more advice.

infercnv::run(T91Cancer_infCNV,
              cutoff = 0.1,
              out_dir = "./InferCNVrun_outputs",
              cluster_by_groups = FALSE,
              HMM = FALSE,
              analysis_mode = "subclusters",
              tumor_subcluster_partition_method = "random_trees",
              num_threads = 64,
              denoise = TRUE)

Thanks again for your help!

GeorgescuC commented 1 year ago

Hi @yunbokai ,

If you are not using the HMM, the subclustering accuracy is not very important and you could even just run analysis_mode="samples" for speed. The hclust will still separate cells for the visualization.

I am still curious why the error happens even in simple mode as someone else seems to have the same issue. What are the dimensions of your expression matrix after filtering?

yunbokai commented 1 year ago

Hi @GeorgescuC,

Thanks for your advice. Here are the dimensions of the two objects that are used in the final infercnv::run() call:

dim(ST_Joined_Counts)

[1] 36601 4746

dim(FinalAnnotationsForExport)

[1] 4750 2

I have 4750 cells/spots and 36601 genes after filtering. I only filtered out the cells/spots with fewer than 500 counts per spot.
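
Note that the two dim() outputs above differ in spot count (4746 columns in the count matrix versus 4750 annotation rows). Whether that matters for this error is not established in the thread, but it is easy to check. The sketch below uses toy stand-ins for ST_Joined_Counts and FinalAnnotationsForExport; the column names `barcode` and `group` are assumptions about the annotation layout.

```r
# Toy stand-ins for the count matrix (genes x spots) and the annotation table
# (one row per spot). All names and values here are made up for illustration.
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("spot", 1:4)))
annotations <- data.frame(
  barcode = paste0("spot", 1:6),
  group   = c("tumor", "tumor", "normal", "normal", "tumor", "normal"),
  stringsAsFactors = FALSE
)

# Barcodes that are annotated but missing from the counts, and vice versa.
annotated_only <- setdiff(annotations$barcode, colnames(counts))
counted_only   <- setdiff(colnames(counts), annotations$barcode)

# Keep only the spots present in both objects, in the same order.
shared <- intersect(colnames(counts), annotations$barcode)
counts_matched      <- counts[, shared, drop = FALSE]
annotations_matched <- annotations[match(shared, annotations$barcode), ]
```

With the real objects, a non-empty `annotated_only` or `counted_only` would show which spots are responsible for the 4746 versus 4750 difference.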

GeorgescuC commented 1 year ago

Hi @yunbokai ,

If you reload the backup object from the last step that completed successfully, can you also check dim(infercnv_obj@expr.data)? This will indicate how many cells and genes are left after filtering. Alternatively, if you could share the data privately, I could look into the issue directly.

Regards, Christophe.
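
The check requested above might look like the following sketch. It assumes the backup object in the output directory can be read with readRDS(); the helper name `backup_dims` is hypothetical, and the path to the backup file has to be filled in from your own output directory.

```r
# Hypothetical helper: load a saved infercnv backup object and report the
# dimensions of its expression matrix (rows = genes, columns = cells/spots).
backup_dims <- function(path) {
  infercnv_obj <- readRDS(path)   # assumes the backup is an RDS file
  dim(infercnv_obj@expr.data)
}
```

The returned pair can be compared with the 36601 x 4746 input matrix to see how many genes and spots survived filtering.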