Open cartal opened 3 years ago
Seems like this is a known issue in the source library: https://github.com/has2k1/scikit-misc/issues/9.
I don't think I could comment much more off the top of my head. @adamgayoso, do you have any suggestions here? Is trying to increase the span fine?
I think the solution is to remove some of the most lowly expressed genes, though increasing span may also work.
Could you suggest some error handling behavior here? I think there could definitely be a more helpful error message.
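One possible shape for a friendlier error, as a minimal sketch only: wrap the HVG call and re-raise with the workarounds mentioned in this thread. The helper name call_with_hvg_hint is hypothetical, and the sketch assumes the failure surfaces as a ValueError, which this thread does not confirm.

```python
def call_with_hvg_hint(hvg_fn, *args, **kwargs):
    """Call hvg_fn and, on failure, re-raise with a more actionable message.

    Assumption: the underlying LOESS failure is raised as a ValueError.
    hvg_fn is any callable, e.g. scanpy's sc.pp.highly_variable_genes.
    """
    try:
        return hvg_fn(*args, **kwargs)
    except ValueError as err:
        hint = (
            "LOESS fit failed while selecting highly variable genes. "
            "Workarounds reported by users: filter lowly expressed genes, "
            "increase `span`, or check for batches with very few cells."
        )
        # Keep the original message and chain the original exception.
        raise ValueError(f"{err}\n{hint}") from err
```

The original exception stays attached through exception chaining, so the low-level message is still available for debugging.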
I also experienced this a few times, and it took me some time to understand what was going on. I fully agree with @ivirshup, we should improve the error message.
In the end, increasing the span to 1 fixed it for me. However, I'm still not sure why it happened.
I wish I understood why this was happening too. I believe it's the same underlying C code as the R implementation, and I don't think Seurat's code does anything special to prevent this.
Increasing the span could really affect which genes are selected as HVGs, I believe, whereas removing some outliers by low expression might not?
In my experience, this happens if batch_key is not None and one or more batches have a low number of cells. Would it make sense to catch this error and simply skip the problematic batch, or inform the user that the batch doesn't have enough cells?
@cartal Do you mind sharing the output of the following:
adata.obs.combined.value_counts()
I think the problem happens when some category in 'batch_key' has only one sample.
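To check for small batches before calling the HVG function, something like the sketch below could help. The helper name small_batches and the min_cells=10 threshold are assumptions for illustration, not anything scanpy defines.

```python
from collections import Counter


def small_batches(labels, min_cells=10):
    """Return {batch: n_cells} for batches with fewer than min_cells cells.

    labels is any iterable of per-cell batch labels, for example
    adata.obs["combined"]. The default threshold is arbitrary.
    """
    counts = Counter(labels)
    return {batch: n for batch, n in counts.items() if n < min_cells}


# Toy labels: batch "a" is large, "b" and "c" are suspiciously small.
labels = ["a"] * 50 + ["b"] * 3 + ["c"]
print(small_batches(labels))  # {'b': 3, 'c': 1}
```

Any batch reported here is a candidate for merging or dropping before running HVG selection with a batch key.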
I ran into this recently. The problem can also occur when the batch key has many cells in each batch (see plot). Increasing the span from the default of 0.3 to 0.5 seems to have "fixed" the error. Increasing the filtering stringency for lowly expressed genes (to min_genes=500, min_cells=10) also gets rid of the error.
sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)
sc.pp.highly_variable_genes(
    adata,
    layer="counts",
    flavor="seurat_v3",
    n_top_genes=num_hvgs,
    batch_key='sex_cell_subtype',
    span=0.5,
)
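If you don't want to pick a span by hand, a rough sketch of an automatic fallback is to try increasing spans until one works. The helper hvg_with_span_fallback and the span sequence are hypothetical, and it again assumes the failure is raised as a ValueError. As noted earlier in the thread, a larger span may change which genes get selected, so the span that was used is returned.

```python
def hvg_with_span_fallback(hvg_fn, adata, spans=(0.3, 0.5, 1.0), **kwargs):
    """Call hvg_fn(adata, span=..., **kwargs) with each span in turn.

    Returns the first span that did not raise. Assumption: a failed
    LOESS fit surfaces as a ValueError.
    """
    last_err = None
    for span in spans:
        try:
            hvg_fn(adata, span=span, **kwargs)
            return span
        except ValueError as err:
            last_err = err  # remember the failure and try a larger span
    raise ValueError(f"HVG selection failed for all spans {spans}") from last_err


# Intended use with scanpy (not run here):
# used_span = hvg_with_span_fallback(
#     sc.pp.highly_variable_genes, adata,
#     layer="counts", flavor="seurat_v3", n_top_genes=num_hvgs,
# )
```

Logging the returned span alongside the results makes it clear when the default of 0.3 was not the value actually used.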
Hi,
I'm trying to run scVI to analyse my data using the latest scanpy+scvi-tools workflow, as described here. However, I'm running into a weird issue with the new seurat_v3 flavour to call HVGs. When I run this:
I get the following error:
While looking for a solution, I came across this issue that reports a similar problem.
Any ideas of what this may be?
Thanks
Versions