satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

How to obtain the same clusters as when performing clustering on RNA GEX vs on loom's "spliced" in RNA Velocity analysis? #9115

Closed denvercal1234GitHub closed 3 months ago

denvercal1234GitHub commented 3 months ago

The RNA velocity tutorial (http://htmlpreview.github.io/?https://github.com/satijalab/seurat-wrappers/blob/master/docs/scvelo.html), which @mojaveazure recommended in issue https://github.com/satijalab/seurat/issues/3423, performs clustering on the "spliced" counts of the loom objects.

I therefore subset the loom objects to the same cells and genes as my GEX Seurat object and ran the same clustering workflow. My aim was to reproduce the GEX clusters from the "spliced" counts of the loom objects; instead, I obtained clusters that differ from those produced by the same workflow on the GEX "RNA" assay.

QUESTION 1. How should I process the loom objects so that the velocity trajectories can be overlaid directly on the original clusters of the GEX object?

QUESTION 2. Do we expect the clusters generated by GEX (i.e., cellranger count outputs) and the clusters generated by setting bm[["RNA"]] <- bm[["spliced"]] (i.e., performing clustering on "spliced" counts) to be the same/similar?

Thank you for your help!

Related to https://github.com/satijalab/seurat/issues/6863, https://github.com/satijalab/seurat/issues/6869 and https://github.com/satijalab/seurat/issues/5318

[Repost of https://github.com/satijalab/seurat/issues/6959, as advised]

rsatija commented 3 months ago

Thanks for reposting here. The spliced and total gene expression matrices are similar but not identical, so the ensuing clusters should also be similar, but not identical.
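One way to see how similar two clusterings are is to cross-tabulate the labels for the shared cells. The sketch below uses base R only, with made-up cell names and cluster labels (`clusters_gex`, `clusters_spliced`, and the `agreement` summary are all illustrative, not part of Seurat). Note that cluster numbers are assigned independently in each run, so cluster "0" in one result need not correspond to cluster "0" in the other.

```r
# Hypothetical cluster labels for the same six cells from two separate runs
cells <- paste0("cell", 1:6)
clusters_gex     <- setNames(c("0", "0", "1", "1", "2", "2"), cells)
clusters_spliced <- setNames(c("1", "1", "0", "0", "0", "2"), cells)

# Align on cell names before comparing, in case the orderings differ
shared <- intersect(names(clusters_gex), names(clusters_spliced))

# Rows are GEX clusters, columns are spliced clusters
confusion <- table(GEX = clusters_gex[shared],
                   spliced = clusters_spliced[shared])
print(confusion)

# Rough summary: fraction of cells that fall in the most common
# spliced cluster for their GEX cluster
agreement <- sum(apply(confusion, 1, max)) / sum(confusion)
print(agreement)
```

With real objects, the two label vectors could be taken from each object's metadata, assuming both were clustered beforehand.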

You can certainly perform clustering on the full GEX matrix and use the resulting clusters to set the identities of the spliced object. Assuming the two objects have the same cell names:

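# Cluster labels from the full GEX object (a vector named by cell name)
clusters_full <- object_full$seurat_annotations
# Index by the spliced object's cell names so labels line up cell by cell
object_spliced$clusters_full <- clusters_full[colnames(object_spliced)]
Idents(object_spliced) <- 'clusters_full'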
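The transfer above depends on the cell names matching, which may not hold out of the box: loom files and cellranger outputs can format barcodes differently. Below is a minimal base-R sketch of checking and harmonizing names first. The barcodes and both formats (a `sample1:` prefix with a trailing `x` on the loom side, a `-1` suffix on the GEX side) are assumptions for illustration; inspect `colnames()` of your own objects to see what actually needs changing.

```r
# Hypothetical barcodes; the real formats depend on how each object was built
gex_cells  <- c("AAACCTGAGAAACCAT-1", "AAACCTGAGAAACCGC-1")
loom_cells <- c("sample1:AAACCTGAGAAACCGCx", "sample1:AAACCTGAGAAACCATx")

# Strip the assumed "sample1:" prefix and trailing "x", then append "-1"
bare_barcodes <- sub("x$", "", sub("^[^:]*:", "", loom_cells))
loom_renamed  <- paste0(bare_barcodes, "-1")

# Fail early if any renamed barcode is missing from the GEX object
stopifnot(all(loom_renamed %in% gex_cells))

# Position of each loom cell within the GEX ordering, for looking up labels
idx <- match(loom_renamed, gex_cells)
print(idx)
```

If the names do need changing, `RenameCells(object_spliced, new.names = loom_renamed)` should apply them to the Seurat object before the cluster labels are copied across.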