welch-lab / MultiVelo

Multi-omic velocity inference
BSD 3-Clause "New" or "Revised" License

"TypeError...invalid key" during mv.recover_dynamics_chrom() #13

Closed: anderswe closed this issue 1 year ago

anderswe commented 1 year ago

Hi MultiVelo team,

Super grateful for your work. Excited to run MultiVelo on some in-house 10X multiome data.

I am running into the following error working from the terminal / HPC, and would appreciate your help.

I'm mostly an R user, so I'm not sure how to go about debugging this. The error, along with similar issues on Stack Overflow, makes me think there's a NumPy array / pandas DataFrame mix-up somewhere.

Thanks!

>>> adata_result = mv.recover_dynamics_chrom(rna,
...                                          atac,
...                                          gene_list = hvg,
...                                          max_iter=5,
...                                          init_mode="invert",
...                                          verbose=True,
...                                          parallel=False,
...                                          save_plot=False,
...                                          rna_only=False,
...                                          fit=True,
...                                          n_anchors=500)

1614 genes will be fitted
  0%|                                                | 0/1614 [00:00<?, ?it/s]@@@@@fitting GALNTL6
Traceback (most recent call last):
  File "/hpf/largeprojects/anders/multiome/src/envs/multivelo/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3800, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 144, in pandas._libs.index.IndexEngine.get_loc
TypeError: '(array([False, False, False, ..., False, False, False]), slice(None, None, None))' is an invalid key

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/hpf/largeprojects/anders/multiome/src/envs/multivelo/lib/python3.10/site-packages/multivelo/dynamical_chrom_func.py", line 2727, in recover_dynamics_chrom
    time, state, velocity, likelihood, anchors) = func_to_call(c_mat[:,i], 
  File "/hpf/largeprojects/anders/multiome/src/envs/multivelo/lib/python3.10/site-packages/multivelo/dynamical_chrom_func.py", line 2127, in regress_func
    cdc = ChromatinDynamical(c, 
  File "/hpf/largeprojects/anders/multiome/src/envs/multivelo/lib/python3.10/site-packages/multivelo/dynamical_chrom_func.py", line 871, in __init__
    self.check_partial_trajectory(determine_model=determine_model)
  File "/hpf/largeprojects/anders/multiome/src/envs/multivelo/lib/python3.10/site-packages/multivelo/dynamical_chrom_func.py", line 920, in check_partial_trajectory
    pdist = pairwise_distances(self.embed_coord[w_non_zero,:][w_low,:])
  File "/hpf/largeprojects/anders/multiome/src/envs/multivelo/lib/python3.10/site-packages/pandas/core/frame.py", line 3805, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/hpf/largeprojects/anders/multiome/src/envs/multivelo/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3807, in get_loc
    self._check_indexing_error(key)
  File "/hpf/largeprojects/anders/multiome/src/envs/multivelo/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 5963, in _check_indexing_error
    raise InvalidIndexError(key)
pandas.errors.InvalidIndexError: (array([False, False, False, ..., False, False, False]), slice(None, None, None))

danielee0707 commented 1 year ago

Hi, thank you for using MultiVelo. I'm not familiar with this error. It's strange that pandas is involved with the embed_coord object, since it should normally be a NumPy array. Can you make sure X_umap is present in rna.obsm? You can print it with print(rna.obsm["X_umap"]) and check that it is a plain NumPy array. Also, which version of MultiVelo are you using? You can check with print(mv.__version__). Thanks.
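The check suggested above can be sketched as a small helper. This is only an illustration: check_embedding is a hypothetical function, not part of MultiVelo, and the dictionary below stands in for rna.obsm, which behaves like a mapping from keys to arrays.

```python
import numpy as np


def check_embedding(obsm, key="X_umap"):
    """Hypothetical helper: report whether obsm[key] is a 2-D NumPy array.

    `obsm` can be any mapping, e.g. rna.obsm or a plain dict.
    Returns a (bool, str) pair: the verdict and a short explanation.
    """
    if key not in obsm:
        return False, f"{key} is missing"
    value = obsm[key]
    if not isinstance(value, np.ndarray):
        return False, f"{key} is a {type(value).__name__}, expected numpy.ndarray"
    if value.ndim != 2:
        return False, f"{key} has {value.ndim} dimension(s), expected 2"
    return True, f"{key} is a NumPy array with shape {value.shape}"


# Stand-in for rna.obsm (assumption: 5 cells, 2 UMAP dimensions).
demo_obsm = {"X_umap": np.zeros((5, 2))}
ok, message = check_embedding(demo_obsm)
print(ok, message)
```

On real data the call would presumably be check_embedding(rna.obsm); a False result with a type other than ndarray points to the kind of mix-up described in this thread.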

anderswe commented 1 year ago

That fixed it, thank you!

For anyone else who runs into this issue: I had imported UMAP embeddings from a Seurat/Signac-processed object into a pandas DataFrame, then assigned that DataFrame directly to rna.obsm["X_umap"]. The fix was to convert it to a NumPy array first with rna.obsm["X_umap"] = rna.obsm["X_umap"].to_numpy().
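A minimal sketch of that conversion, using made-up coordinates and cell names rather than real Seurat/Signac output. The names umap_df and obs_names are assumptions for illustration. Because to_numpy() drops the row labels, the sketch first reorders the rows to match the cell order, which is worth doing whenever the exported table may be ordered differently from rna.obs_names.

```python
import numpy as np
import pandas as pd

# Hypothetical UMAP coordinates as they might arrive from an R export.
umap_df = pd.DataFrame(
    {"UMAP_1": [0.0, 1.0, 2.0, 3.0], "UMAP_2": [0.5, 1.5, 2.5, 3.5]},
    index=["cell_d", "cell_b", "cell_a", "cell_c"],
)

# Stand-in for list(rna.obs_names): the cell order used by the AnnData object.
obs_names = ["cell_a", "cell_b", "cell_c", "cell_d"]

# Align rows to the cell order, then drop the labels to get a plain array.
umap_array = umap_df.loc[obs_names].to_numpy()

# NumPy-style indexing with a boolean row mask, as in the traceback above,
# works on the array. The same expression on the DataFrame is what failed.
mask = np.array([True, False, True, True])
subset = umap_array[mask, :]

print(type(umap_array).__name__, umap_array.shape, subset.shape)
```

With a real object, the last step would be assigning the array back, as in rna.obsm["X_umap"] = umap_array.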