BayraktarLab / cell2location

Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics (cell2location model)
https://cell2location.readthedocs.io/en/latest/
Apache License 2.0

Gene imputation #379

Open wangjiawen2013 opened 3 months ago

wangjiawen2013 commented 3 months ago

Hi, can we perform gene imputation using cell2location? (This is something Tangram can do.)

vitkl commented 3 months ago

We generally don't recommend gene imputation, because you cannot add information to the data that isn't there (see "False signals induced by single-cell imputation", Andrews et al., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6415334/). There are very few valid use cases. I would generally not use imputed counts for analysis, only for visualisation, so you need to consider this carefully.

You could instead decompose the spatial counts into the likely contributions of each cell type; see the tutorial.

You can compute imputation as follows:

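import numpy as np
import pandas as pd

# cell type names, in the order used by the trained model
factor_names = adata_vis.uns['mod']['factor_names']

# read per cell type averages for more genes (csv_file is a placeholder path; genes in rows, cell types in columns)
# and make sure that the order of cell labels matches factor_names (cell types missing from the file are filled with 0.0)
more_genes_g_fg = pd.read_csv(csv_file, index_col='Unnamed: 0').reindex(columns=factor_names, fill_value=0.0).T

# export cell abundance (5% quantile of the posterior) to adata_vis.obs, one column per cell type
adata_vis.obs[factor_names] = adata_vis.obsm['q05_cell_abundance_w_sf']

# compute average per gene & location weighted by abundance of cell types:
# (locations x cell types) times (cell types x genes) gives (locations x genes)
more_genes_x_sg = np.einsum('sf,fg->sg', adata_vis.obs[factor_names].to_numpy(), more_genes_g_fg.to_numpy())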
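
To make the shapes in the snippet above concrete, here is a minimal self-contained sketch of the same abundance-weighted sum on toy data. All names and numbers here (`abundance_sf`, `averages_gf`, the spots, genes and cell types) are made up for illustration and are not part of cell2location; the sketch only assumes that the CSV holds genes in rows and cell types in columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 3 locations (s), 2 cell types (f), 4 genes (g).
factor_names = ["T_cell", "B_cell"]

# Stand-in for adata_vis.obsm['q05_cell_abundance_w_sf']: locations x cell types.
abundance_sf = pd.DataFrame(
    [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]],
    index=["spot1", "spot2", "spot3"],
    columns=factor_names,
)

# Stand-in for the CSV of per cell type averages: genes x cell types,
# deliberately stored with the columns in a different order.
averages_gf = pd.DataFrame(
    [[0.0, 1.0], [0.5, 0.5], [2.0, 0.0], [1.0, 3.0]],
    index=["GeneA", "GeneB", "GeneC", "GeneD"],
    columns=["B_cell", "T_cell"],
)

# Align the cell type order with factor_names, then transpose to cell types x genes.
averages_fg = averages_gf.reindex(columns=factor_names, fill_value=0.0).T

# Sum over cell types f: imputed[s, g] = sum_f abundance[s, f] * averages[f, g].
imputed_sg = np.einsum("sf,fg->sg", abundance_sf.to_numpy(), averages_fg.to_numpy())

# Wrap the result so locations and genes keep their labels.
imputed = pd.DataFrame(imputed_sg, index=abundance_sf.index, columns=averages_fg.columns)
print(imputed)
```

The `einsum` call is an ordinary matrix product, so `abundance_sf.to_numpy() @ averages_fg.to_numpy()` gives the same result; the subscript form just makes the location, cell type and gene axes explicit.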