I get a KeyError when running popv with prediction_mode set to "fast".
If needed, I can provide links to the public h5ad files I used (HuBMAP).
Solution
I suspect the problem is this line in _onclass.py:
adata.obs["onclass_seen"] = pred_label_str
which should be changed to
adata.obs[self.seen_result_key] = pred_label_str
as in the non-fast prediction mode path.
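To illustrate the suspected mismatch, here is a minimal sketch that uses a plain pandas DataFrame in place of adata.obs. The class ToyAnnotator, its default key "toy_seen_prediction", and the helper consensus_inputs are hypothetical stand-ins made up for this example, not popv's actual code. It only shows that writing to a hard-coded column while a later step reads the configured key produces a KeyError.

```python
import pandas as pd


class ToyAnnotator:
    """Hypothetical stand-in for an annotation method; not popv's class."""

    def __init__(self, seen_result_key="toy_seen_prediction"):
        # Assumed: the key under which downstream code expects the result.
        self.seen_result_key = seen_result_key

    def predict_hardcoded(self, obs, labels):
        # Mirrors the suspected bug: the configured key is ignored.
        obs["onclass_seen"] = labels

    def predict_configured(self, obs, labels):
        # Mirrors the proposed fix: the result goes to the configured key.
        obs[self.seen_result_key] = labels


def consensus_inputs(obs, prediction_keys):
    # Selecting a list of columns raises KeyError if any of them is missing.
    return obs[prediction_keys]


labels = ["T cell", "B cell"]
method = ToyAnnotator()

# Hard-coded column: the lookup by configured key fails.
buggy = pd.DataFrame(index=["cell_0", "cell_1"])
method.predict_hardcoded(buggy, labels)
try:
    consensus_inputs(buggy, [method.seen_result_key])
    buggy_raises = False
except KeyError:
    buggy_raises = True

# Configured column: the same lookup succeeds.
fixed = pd.DataFrame(index=["cell_0", "cell_1"])
method.predict_configured(fixed, labels)
fixed_columns = list(consensus_inputs(fixed, [method.seen_result_key]).columns)
```

In this sketch only the second variant leaves the expected column in place, which is consistent with the one-line change proposed above.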
Stacktrace
File "/usr/local/lib/python3.11/dist-packages/popv/annotation.py", line 75, in annotate_data
compute_consensus(adata, all_prediction_keys_seen)
File "/usr/local/lib/python3.11/dist-packages/popv/annotation.py", line 119, in compute_consensus
consensus_prediction = adata.obs[prediction_keys].apply(_utils.majority_vote, axis=1)
~~~~~~~~~^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/pandas/core/frame.py", line 3813, in __getitem__
indexer = self.columns._get_indexer_strict(key, "columns")[1]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/pandas/core/indexes/base.py", line 6070, in _get_indexer_strict
self._raise_if_missing(keyarr, indexer, axis_name)
File "/usr/local/lib/python3.11/dist-packages/pandas/core/indexes/base.py", line 6133, in _raise_if_missing
raise KeyError(f"{not_found} not in index")
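A quick way to confirm which key is missing before the consensus step is to compare the requested keys against the columns that actually exist. The sketch below uses a toy DataFrame in place of adata.obs. The helper missing_columns and all column names in it are assumptions made for illustration, not popv's real keys.

```python
import pandas as pd


def missing_columns(obs, prediction_keys):
    """Return the requested keys that are absent from obs, in request order."""
    return [key for key in prediction_keys if key not in obs.columns]


# Toy stand-in for adata.obs after a run where one method wrote to a
# hard-coded column instead of its configured key (hypothetical names).
obs = pd.DataFrame(
    {"knn_prediction": ["T cell"], "onclass_seen": ["T cell"]},
    index=["cell_0"],
)
requested = ["knn_prediction", "toy_onclass_seen_prediction"]

missing = missing_columns(obs, requested)
print("missing prediction keys:", missing)
```

Running the same check on the real adata.obs with the list passed to compute_consensus should show whether the OnClass key is the only one absent.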
Version information
popv 0.4.2
session_info 1.0.0
97d2165b493fecec79c65b5c6254dffd4e375528 NA
OnClass NA
PIL 10.2.0
absl NA
anndata 0.10.6
annoy NA
astunparse 1.6.3
attr 23.2.0
celltypist 1.6.2
certifi 2024.02.02
charset_normalizer 3.3.2
chex 0.1.85
contextlib2 NA
cycler 0.12.1
cython_runtime NA
dateutil 2.9.0.post0
docrep 0.3.2
etils 1.7.0
fbpca NA
filelock 3.13.1
flatbuffers 24.3.7
flax 0.8.1
fontTools 4.49.0
fsspec 2024.2.0
gast 0.5.4
google NA
h5py 3.10.0
harmony 0.1.8
huggingface_hub 0.21.4
idna 3.6
igraph 0.11.4
importlib_resources NA
intervaltree NA
jax 0.4.25
jaxlib 0.4.25
joblib 1.3.2
keras 3.0.5
kiwisolver 1.4.5
leidenalg 0.10.2
lightning 2.1.4
lightning_utilities 0.10.1
llvmlite 0.42.0
matplotlib 3.8.3
ml_collections NA
ml_dtypes 0.3.2
mpl_toolkits NA
msgpack 1.0.8
mudata 0.2.3
multipledispatch 0.6.0
namex NA
natsort 8.4.0
networkx 3.2.1
numba 0.59.0
numpy 1.26.4
numpyro 0.14.0
obonet 1.0.0
opt_einsum v3.3.0
optax 0.2.1
packaging 24.0
pandas 1.5.3
patsy 0.5.6
pkg_resources NA
psutil 5.9.8
pybind11_abseil NA
pygments 2.17.2
pynndescent 0.5.11
pyparsing 3.1.2
pyro 1.9.0
pytz 2024.1
regex 2.5.140
requests 2.31.0
rich NA
scanorama 1.7.4
scanpy 1.9.8
scipy 1.12.0
scvi 1.1.2
seaborn 0.13.2
setuptools 69.1.1
sitecustomize NA
six 1.16.0
sklearn 1.1.3
socks 1.7.1
sortedcontainers 2.4.0
statsmodels 0.14.1
tensorflow 2.16.1
termcolor NA
texttable 1.7.0
threadpoolctl 3.3.0
tokenizers 0.15.2
toolz 0.12.1
torch 2.2.1+cu121
torchgen NA
torchmetrics 1.3.1
tqdm 4.66.2
transformers 4.38.2
tree 0.1.8
typing_extensions NA
urllib3 2.2.1
wrapt 1.16.0
yaml 6.0.1
zoneinfo NA
Python 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0]
Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Session information updated at 2024-03-25 20:15