bioinfo-biols / SEVtras

sEV-containing droplet identification in scRNA-seq data (SEVtras)
GNU Affero General Public License v3.0

error in SEVtras.ESAI_calculator when constructing adata_cell.raw #19

Closed Jinqingchang closed 2 months ago

Jinqingchang commented 2 months ago

Dear developer: SEVtras requires adata to be saved as adata.raw before any filtering or normalization, whereas the standard Scanpy workflow sets adata.raw after quality-control filtering and log transformation. To accommodate this, I kept an identical, untreated copy of the object, adata2 (n_obs × n_vars = 298308 × 32285), at the time of object creation. After completing the standard Scanpy analysis on adata1 (n_obs × n_vars = 232884 × 25113) and obtaining the cell types, I set adata.raw = adata2.copy(). The resulting adata.raw.shape was (232884, 32285), which suggested that the cells had been filtered while all genes were kept, seemingly the desired outcome. However, when I ran ESAI_calculator, I encountered the following error:

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/SEVtras/main.py:188, in ESAI_calculator(adata_ev_path, adata_cell_path, out_path, species, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW, plot_cmp, save_plot_prefix, OBSMumap, size)
    186 adata_cell = read_adata(adata_cell_path, get_only=False)
    187 from .functional import deconvolver, ESAI_celltype, plot_SEVumap, plot_ESAIumap
--> 188 celltype_e_number, adata_evS, adata_com = deconvolver(adata_ev, adata_cell, species, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW)
    189 ##ESAI for sample
    190 sample_ESAI = (adata_com[adata_com.obs[OBScelltype]==OBSev,].obs[OBSsample].value_counts() / adata_com[adata_com.obs[OBScelltype]!=OBSev,].obs[OBSsample].value_counts()).fillna(0)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/SEVtras/functional.py:117, in deconvolver(adata_ev, adata_cell, species, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW)
    115 def deconvolver(adata_ev, adata_cell, species, OBSsample='batch', OBScelltype='celltype', OBSev='sEV', OBSMpca='X_pca', cellN=10, Xraw = True, normalW=True):
--> 117     adata_combined = preprocess_source(adata_ev, adata_cell, OBScelltype=OBScelltype, OBSev=OBSev, Xraw = Xraw)
    118     gsea_pval_dat = source_biogenesis(adata_cell, species, OBScelltype=OBScelltype, Xraw = Xraw, normalW=normalW)
    119     near_neighbor_dat = near_neighbor(adata_combined, OBSsample=OBSsample, OBSev=OBSev, OBScelltype=OBScelltype, OBSMpca=OBSMpca, cellN=cellN)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/SEVtras/functional.py:74, in preprocess_source(adata_ev, adata_cell, OBScelltype, OBSev, Xraw)
     71 def preprocess_source(adata_ev, adata_cell, OBScelltype='celltype', OBSev='sEV', Xraw = True):
     72     ## cell type
     73     if Xraw:
---> 74         adata_cell_raw = copy.copy(adata_cell.raw.to_adata())
     75     else:
     76         adata_cell_raw = copy.copy(adata_cell)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/anndata/_core/raw.py:159, in Raw.to_adata(self)
    156 """Create full AnnData object."""
    157 from anndata import AnnData
--> 159 return AnnData(
    160     X=self.X.copy(),
    161     var=self.var.copy(),
    162     varm=None if self._varm is None else self._varm.copy(),
    163     obs=self._adata.obs.copy(),
    164     obsm=self._adata.obsm.copy(),
    165     obsp=self._adata.obsp.copy(),
    166     uns=self._adata.uns.copy(),
    167 )

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/anndata/_core/anndata.py:271, in AnnData.__init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
    269     self._init_as_view(X, oidx, vidx)
    270 else:
--> 271     self._init_as_actual(
    272         X=X,
    273         obs=obs,
    274         var=var,
    275         uns=uns,
    276         obsm=obsm,
    277         varm=varm,
    278         raw=raw,
    279         layers=layers,
    280         dtype=dtype,
    281         shape=shape,
    282         obsp=obsp,
    283         varp=varp,
    284         filename=filename,
    285         filemode=filemode,
    286     )

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/anndata/_core/anndata.py:453, in AnnData._init_as_actual(self, X, obs, var, uns, obsm, varm, varp, obsp, raw, layers, dtype, shape, filename, filemode)
    450 source = "shape"
    452 # annotations
--> 453 self._obs = _gen_dataframe(
    454     obs, ["obs_names", "row_names"], source=source, attr="obs", length=n_obs
    455 )
    456 self._var = _gen_dataframe(
    457     var, ["var_names", "col_names"], source=source, attr="var", length=n_vars
    458 )
    460 # now we can verify if indices match!

File ~/miniconda3/envs/scanpy/lib/python3.10/functools.py:889, in singledispatch.<locals>.wrapper(*args, **kw)
    885 if not args:
    886     raise TypeError(f'{funcname} requires at least '
    887                     '1 positional argument')
--> 889 return dispatch(args[0].__class__)(*args, **kw)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/anndata/_core/aligned_df.py:64, in _gen_dataframe_df(anno, index_names, source, attr, length)
     54 @_gen_dataframe.register(pd.DataFrame)
     55 def _gen_dataframe_df(
     56     anno: pd.DataFrame,
   (...)
     61     length: int | None = None,
     62 ):
     63     if length is not None and length != len(anno):
---> 64         raise _mk_df_error(source, attr, length, len(anno))
     65     anno = anno.copy(deep=False)
     66     if not is_string_dtype(anno.index):

ValueError: Observations annot. obs must have as many rows as X has rows (298308), but has 232884 rows.

Perhaps my understanding of the Scanpy data structure is insufficient, so I may not have created adata.raw correctly. How should I obtain an adata_cell that is annotated with cell types? According to the error message, should I avoid filtering any cells or genes at all? I look forward to any help in resolving this issue.

Jinqingchang commented 2 months ago

Sorry, I just saw your updated tutorials! It has been fixed!

RuiqiaoHe commented 2 months ago

Thanks for your testing~