BayraktarLab / cell2location

Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics (cell2location model)
https://cell2location.readthedocs.io/en/latest/
Apache License 2.0

Best practices for multiple slides space mappings! #362

Closed Zhongzheng99 closed 1 month ago

Zhongzheng99 commented 2 months ago

Hello @vitkl. I have slides from three groups. Following the tutorial (https://cell2location.readthedocs.io/en/latest/notebooks/cell2location_short_demo.html#1.-Loading-Visium-data), I combined all the Visium data into one AnnData object using concat and then estimated the posterior distribution of cell abundance. However, the official Scanpy tutorial recommends analyzing the Visium data of each slide individually (https://scanpy.readthedocs.io/en/latest/tutorials/spatial/integration-scanorama.html#data-integration-and-label-transfer-from-scrna-seq-dataset). What is the best practice: should I integrate first and then perform spatial mapping?

vitkl commented 1 month ago

In general, we recommend using as many samples as fit into GPU memory, because this helps cell2location distinguish cell abundance from technical sensitivity effects. So concatenate the AnnData objects, then apply cell2location.

See the Methods section of our paper for a detailed explanation of the multi-sample model and the benefits it provides.

In some cases, such as when samples come from very different sources (e.g. Visium fresh frozen vs Visium FFPE), samples from different sources have to be analysed separately.
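One way to organise this in practice is to group slides by source first and then run one joint analysis per group. The sketch below uses only the standard library; the slide IDs, the source labels and the `group_slides_by_source` helper are hypothetical and not part of cell2location.

```python
from collections import defaultdict

# Hypothetical mapping from slide ID to tissue preparation protocol.
slide_sources = {
    "slide_A": "fresh_frozen",
    "slide_B": "fresh_frozen",
    "slide_C": "FFPE",
}


def group_slides_by_source(slide_sources):
    """Return {source: [slide IDs]} so each source can be analysed on its own."""
    groups = defaultdict(list)
    for slide_id, source in slide_sources.items():
        groups[source].append(slide_id)
    return dict(groups)


batches = group_slides_by_source(slide_sources)

for source, slide_ids in batches.items():
    # Each group would be concatenated and passed to cell2location separately.
    print(source, slide_ids)
```

Within each group, the slides are still concatenated and mapped together, so the benefit of the multi-sample model is kept for samples that share a source.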

Zhongzheng99 commented 1 month ago

Thank you for your reply!