Clarification on PCs input data to generate cell types

livnatje / DIALOGUE

DIALOGUE is a dimensionality reduction method that uses cross-cell-type associations to identify multicellular programs (MCPs) and map the cell transcriptome as a function of its environment.


Clarification on PCs input data to generate cell types #50

Open edg1983 opened 1 month ago

edg1983 commented 1 month ago

Hello,

From the documentation on generating cell type objects, I understand that one should provide normalized counts as `tpm` and a table of principal components for each cell as `X`.

The tutorial says to "perform the initial dimensionality reduction when using only cells of a specific type or subtype to adequately capture the variation within that specific subset."

Does this mean that, for each cell type, I need to re-compute PCs using only the cells of that cell type? If so, do you suggest also re-computing highly variable genes on that subset?

Thanks!

edg1983 commented 1 month ago

I have another related question.

I have also run integration on my data using SCVI, so I have an SCVI latent representation that I use instead of PCs, for example for neighbor computation.

Should I use these latents or PCs as the X object in the DIALOGUE cell type? And if PCs, would it be better to use the original PCs or PCs re-computed for each cell type?

livnatje commented 2 weeks ago

You can try running it with the SCVI output, or any other feature space, as the input. The second step of DIALOGUE will convert the MCPs to the gene level. There is a chance that the SCVI and PC inputs will converge to the same MCPs anyway, but this is not something we have tested.
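
For readers following this thread, here is a minimal sketch of what "re-computing PCs per cell type" could look like. It is an illustration only, not taken from the DIALOGUE codebase: DIALOGUE itself is an R package, and this example uses Python with NumPy purely to show the idea. The function name `per_type_pcs`, its parameters, and the use of plain per-gene variance as a stand-in for highly variable gene selection are all assumptions made for the example.

```python
import numpy as np


def per_type_pcs(expr, cell_types, n_hvg=2000, n_pcs=30):
    """Compute PCs separately within each cell type (hypothetical helper).

    expr:       cells x genes matrix of normalized expression values
    cell_types: one label per cell (same order as the rows of expr)
    Returns a dict mapping each cell type to a cells x n_pcs score matrix.
    """
    expr = np.asarray(expr, dtype=float)
    cell_types = np.asarray(cell_types)
    pcs = {}
    for ct in np.unique(cell_types):
        # Keep only the cells of this type.
        sub = expr[cell_types == ct]
        # Assumption: rank genes by variance within the subset as a simple
        # stand-in for a proper highly-variable-gene procedure.
        k = min(n_hvg, sub.shape[1])
        hvg = np.argsort(sub.var(axis=0))[::-1][:k]
        # Center each selected gene, then run PCA via SVD.
        centered = sub[:, hvg] - sub[:, hvg].mean(axis=0)
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        n = min(n_pcs, u.shape[1])
        # PC scores: left singular vectors scaled by singular values.
        pcs[str(ct)] = u[:, :n] * s[:n]
    return pcs
```

With a precomputed latent space such as the SCVI output mentioned above, the equivalent step would presumably be to take the rows of the latent matrix belonging to each cell type rather than re-running a decomposition. Whether that gives the same MCPs as per-type PCs is, as noted in the reply, untested.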