scverse / scanpy


sc.tl.score_genes -- ValueError: No valid genes were passed for scoring #3266

Open xyang2uchicago opened 3 days ago

xyang2uchicago commented 3 days ago

What happened?

I see this error in the older versions 1.9.3 and 1.9.8 as well as in the newest version, 1.10.3. I am fairly sure my gene names are in adata.var_names (see the code below), and I have confirmed that each gene set contains two or more genes. Can anyone help me debug this?

Thank you, Holly

Minimal code sample

print(sc.__version__)
# 1.10.3

# Randomly select 1000 cell indices
selected_cells = np.random.choice(adata.obs.index, size=1000, replace=False)
# Create a subset AnnData object
subset_adata = adata[selected_cells].copy()
subset_adata.write(save_fold + "subset_adata.h5ad")

#subset_known_markers = dict(list(filtered_known_markers.items())[:2])
tmp = ['Isl1', 'Tcf21', 'Tlx1'] 
[gene for gene in tmp if gene in subset_adata.var_names] == tmp  # True
tmp = ['Gata4', 'Nkx2-5', 'Nr2f2', 'Osr1', 'Tbx5', 'Wnt2'] 
[gene for gene in tmp if gene in subset_adata.var_names] == tmp  # True
subset_known_markers = {
    'Anterior cardiopharyngeal progenitors_Imaz2024': ['Isl1', 'Tcf21', 'Tlx1'], 
    'Cardiomyocytes FHF 1_Imaz2024': ['Gata4', 'Nkx2-5', 'Nr2f2', 'Osr1', 'Tbx5', 'Wnt2']
}

tmp = sc.tl.score_genes(subset_adata, gene_list= subset_known_markers, copy=True 
        #,use_raw=True 
        #,n_bins = 150 , ctrl_size =100
        )    # ctrl_size = 50 by default ; n_bins = 25 by default

Error output

WARNING: genes are not in var_names and ignored: ['Anterior cardiopharyngeal progenitors_Imaz2024', 'Cardiomyocytes FHF 1_Imaz2024']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/project/xyang2/anaconda/py38/lib/python3.8/site-packages/scanpy/tools/_score_genes.py", line 115, in score_genes
    raise ValueError("No valid genes were passed for scoring.")
ValueError: No valid genes were passed for scoring.
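
A possible reading of the warning above, offered as an assumption rather than a confirmed diagnosis: the names reported as missing from var_names are the two dictionary keys, not gene symbols. Iterating over a Python dict yields its keys, so a dict passed as gene_list would be read as two "genes" named after the gene sets. The sketch below reproduces that effect without scanpy and outlines a per-set loop. The helper `score_each_set` and its `score_fn` parameter are hypothetical names introduced here for illustration, and the `var_names` set simply restates the genes the checks above found to be present.

```python
# Genes assumed to be in var_names, as the membership checks in the report indicate.
var_names = {"Isl1", "Tcf21", "Tlx1", "Gata4", "Nkx2-5", "Nr2f2", "Osr1", "Tbx5", "Wnt2"}

subset_known_markers = {
    "Anterior cardiopharyngeal progenitors_Imaz2024": ["Isl1", "Tcf21", "Tlx1"],
    "Cardiomyocytes FHF 1_Imaz2024": ["Gata4", "Nkx2-5", "Nr2f2", "Osr1", "Tbx5", "Wnt2"],
}

# Iterating a dict yields its keys, so these two labels are what a
# function expecting a sequence of gene names would see.
seen_as_genes = list(subset_known_markers)
unmatched = [g for g in seen_as_genes if g not in var_names]


def score_each_set(adata, gene_sets, score_fn):
    """Hypothetical helper: call score_fn once per gene set.

    score_fn is expected to accept gene_list and score_name keyword
    arguments, as sc.tl.score_genes does.
    """
    for set_name, genes in gene_sets.items():
        score_fn(adata, gene_list=genes, score_name=set_name)
```

Under these assumptions, `score_each_set(subset_adata, subset_known_markers, sc.tl.score_genes)` would pass one list of gene symbols per call and, with the default copy=False, store each result in subset_adata.obs under the gene set's name.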

Versions

```
-----
anndata     0.9.2
scanpy      1.9.3
-----
PIL                 10.4.0
cloudpickle         3.0.0
cycler              0.12.1
cython_runtime      NA
cytoolz             0.12.3
dask                2023.5.0
dateutil            2.9.0.post0
h5py                3.11.0
igraph              0.11.6
importlib_resources NA
jinja2              3.1.4
joblib              1.4.2
kiwisolver          1.4.7
leidenalg           0.10.2
llvmlite            0.41.1
lz4                 4.3.3
markupsafe          2.1.5
matplotlib          3.7.5
mpl_toolkits        NA
natsort             8.4.0
numba               0.58.1
numexpr             2.8.6
numpy               1.24.4
packaging           24.1
pandas              2.0.3
psutil              6.0.0
pyarrow             17.0.0
pyparsing           3.1.4
pytz                2024.2
scipy               1.10.1
session_info        1.0.0
six                 1.16.0
sklearn             1.3.2
tblib               3.0.0
texttable           1.7.0
threadpoolctl       3.5.0
tlz                 0.12.3
toolz               0.12.1
typing_extensions   NA
yaml                6.0.2
zipp                NA
-----
Python 3.8.19 | packaged by conda-forge | (default, Mar 20 2024, 12:47:35) [GCC 12.3.0]
Linux-4.18.0-305.3.1.el8.x86_64-x86_64-with-glibc2.10
-----
Session information updated at 2024-09-26 13:44
```