satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Error merging index data frame with map #6902

Closed coolmak32 closed 1 year ago

coolmak32 commented 1 year ago

Hello, I am following the tutorial posted by Basilkhuder on RNA velocity estimation for a Seurat object (https://github.com/basilkhuder/Seurat-to-RNA-Velocity).

I get an error at the step that merges the index data frame with the UMAP coordinates so that they match the order of the anndata object. Please see the code and output below:

import anndata
import scvelo as scv
import pandas as pd
import numpy as np
import matplotlib as plt
sample_one = anndata.read_loom("cellRanger.loom")
sample_obs = pd.read_csv("cellID_obs.csv")
umap_cord = pd.read_csv("cell_embeddings.csv")
cell_clusters = pd.read_csv("clusters.csv")
sample_one = sample_one[np.isin(sample_one.obs.index,sample_obs["x"])]
umap = pd.read_csv("cell_embeddings.csv")
sample_one.obs.index
sample_one_index = pd.DataFrame(sample_one.obs.index)
sample_one_index = sample_one_index.rename(columns = {0:'Cell ID'})
umap = umap.rename(columns = {'Unnamed: 0':'Cell ID'})
umap_ordered = sample_one_index.merge(umap, on = "Cell ID")

KeyError                                  Traceback (most recent call last)
Cell In[16], line 1
----> 1 umap_ordered = sample_one_index.merge(umap, on = "Cell ID")

File ~/Documents/anaconda3/envs/ScveloR/lib/python3.10/site-packages/pandas/core/frame.py:10093, in DataFrame.merge(self, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
  10074 @Substitution("")
  10075 @Appender(_merge_doc, indents=2)
  10076 def merge(
   (...)
  10089     validate: str | None = None,
  10090 ) -> DataFrame:
  10091     from pandas.core.reshape.merge import merge
  10093     return merge(
  10094         self,
  10095         right,
  10096         how=how,
  10097         on=on,
  10098         left_on=left_on,
  10099         right_on=right_on,
  10100         left_index=left_index,
  10101         right_index=right_index,
  10102         sort=sort,
  10103         suffixes=suffixes,
  10104         copy=copy,
  10105         indicator=indicator,
  10106         validate=validate,
  10107     )

File ~/Documents/anaconda3/envs/ScveloR/lib/python3.10/site-packages/pandas/core/reshape/merge.py:110, in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     93 @Substitution("\nleft : DataFrame or named Series")
     94 @Appender(_merge_doc, indents=0)
     95 def merge(
   (...)
    108     validate: str | None = None,
    109 ) -> DataFrame:
--> 110     op = _MergeOperation(
    111         left,
    112         right,
    113         how=how,
    114         on=on,
    115         left_on=left_on,
    116         right_on=right_on,
    117         left_index=left_index,
    118         right_index=right_index,
    119         sort=sort,
    120         suffixes=suffixes,
    121         indicator=indicator,
    122         validate=validate,
    123     )
    124     return op.get_result(copy=copy)

File ~/Documents/anaconda3/envs/ScveloR/lib/python3.10/site-packages/pandas/core/reshape/merge.py:703, in _MergeOperation.__init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, indicator, validate)
    696     self._cross = cross_col
    698 # note this function has side effects
    699 (
    700     self.left_join_keys,
    701     self.right_join_keys,
    702     self.join_names,
--> 703 ) = self._get_merge_keys()
    705 # validate the merge keys dtypes. We may need to coerce
    706 # to avoid incompatible dtypes
    707 self._maybe_coerce_merge_keys()

File ~/Documents/anaconda3/envs/ScveloR/lib/python3.10/site-packages/pandas/core/reshape/merge.py:1179, in _MergeOperation._get_merge_keys(self)
   1175 if lk is not None:
   1176     # Then we're either Hashable or a wrong-length arraylike,
   1177     #  the latter of which will raise
   1178     lk = cast(Hashable, lk)
-> 1179     left_keys.append(left._get_label_or_level_values(lk))
   1180     join_names.append(lk)
   1181 else:
   1182     # work-around for merge_asof(left_index=True)

File ~/Documents/anaconda3/envs/ScveloR/lib/python3.10/site-packages/pandas/core/generic.py:1850, in NDFrame._get_label_or_level_values(self, key, axis)
   1844     values = (
   1845         self.axes[axis]
   1846         .get_level_values(key)  # type: ignore[assignment]
   1847         ._values
   1848     )
   1849 else:
-> 1850     raise KeyError(key)
   1852 # Check for duplicates
   1853 if values.ndim > 1:

KeyError: 'Cell ID'

Could somebody please help me solve this error? I only have beginner-level expertise, so please excuse me if there is a very basic workaround for this issue.

Thank you.

saketkc commented 1 year ago

This is not a Seurat issue, but it seems the "Cell ID" column is missing from your metadata (which the code you linked expects).
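One possible cause, offered as a guess rather than a confirmed diagnosis: the traceback fails while collecting the left-hand keys, which suggests that sample_one_index is the frame without a "Cell ID" column. If the obs index read from the loom file carries a name, pd.DataFrame(index) uses that name as the column label instead of 0, so rename(columns = {0:'Cell ID'}) silently changes nothing. The sketch below reproduces this with made-up stand-in data (the index name "CellID", the cell names, and the UMAP values are all assumptions) and builds the column by name instead. It also uses how="left", which differs from the default inner merge in the original code, to keep every cell in the anndata order.

```python
import pandas as pd

# Stand-in for sample_one.obs.index; the index name "CellID" is an assumption.
obs_index = pd.Index(["cell_b", "cell_a", "cell_c"], name="CellID")

# Stand-in for cell_embeddings.csv; the values are made up for illustration.
umap = pd.DataFrame({
    "Unnamed: 0": ["cell_a", "cell_b", "cell_c"],
    "UMAP_1": [0.1, 0.2, 0.3],
    "UMAP_2": [1.1, 1.2, 1.3],
})

# A named index becomes a column labelled with that name, not 0,
# so this rename has no effect and "Cell ID" never appears.
naive = pd.DataFrame(obs_index).rename(columns={0: "Cell ID"})
print(list(naive.columns))

# Build the column explicitly so the label does not depend on the index name.
sample_one_index = pd.DataFrame({"Cell ID": obs_index.to_numpy()})

# Rename the first column of the embeddings, whatever it happens to be called.
umap = umap.rename(columns={umap.columns[0]: "Cell ID"})

# Fail early with a readable message if either frame lacks the key.
for label, frame in [("sample_one_index", sample_one_index), ("umap", umap)]:
    if "Cell ID" not in frame.columns:
        raise KeyError(f"{label} has no 'Cell ID' column: {list(frame.columns)}")

# A left merge keeps the rows in the order of the anndata index.
umap_ordered = sample_one_index.merge(umap, on="Cell ID", how="left")
print(umap_ordered)
```

Printing sample_one_index.columns and umap.columns just before the merge in the real session should show which of the two frames is missing the column.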