welch-lab / liger

R package for integrating and analyzing multiple single-cell datasets
GNU General Public License v3.0
Feasibility of projecting Mutant samples onto cell space computed on WT samples #223

Closed fbrundu closed 3 years ago

fbrundu commented 3 years ago

Hi,

I have a methodological question about LIGER. Does it make sense to first align two datasets on their WT samples, and then project the Mutant samples onto this alignment (following Scenario 3 in this tutorial)? One step makes me think this approach will fail to capture signal coming from the Mutant samples: before alignment, the top variable genes have to be selected on the WT data and then copied to the Mutant object (see #222 for more context):

wt.ldata <- selectGenes(wt.ldata)
mut.ldata@var.genes <- wt.ldata@var.genes

By design, genes that vary strongly between Mutant and WT might not vary much within the WT samples. Such genes won't be selected for iNMF, so they also won't contribute when the Mutant samples are projected onto the same space.
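The concern above can be illustrated with a small simulation. This is only a sketch in base R under made-up assumptions: the three gene names, the effect sizes, and the use of raw variance as the ranking criterion are all hypothetical, and it does not reproduce what `selectGenes` actually computes.

```r
# Toy simulation (hypothetical genes and effect sizes, not LIGER's selectGenes).
set.seed(1)
n <- 200  # cells per condition

# Each condition has two cell types that differ in `celltype_marker`.
# `mutant_response` is flat within a condition; only its mean differs
# between WT and Mutant.
simulate <- function(response_mean) {
  rbind(
    celltype_marker = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 5)),
    mutant_response = rnorm(n, mean = response_mean),
    housekeeping    = rnorm(n, mean = 3)
  )
}

wt  <- simulate(response_mean = 0)
mut <- simulate(response_mean = 8)

# Per-gene variance computed on WT alone versus on WT and Mutant together.
var_wt   <- apply(wt, 1, var)
var_both <- apply(cbind(wt, mut), 1, var)

top_wt   <- names(which.max(var_wt))    # most variable gene using WT only
top_both <- names(which.max(var_both))  # most variable gene using both conditions

print(round(rbind(WT_only = var_wt, WT_plus_Mutant = var_both), 2))
```

In this toy setup the gene that separates the conditions only ranks as highly variable once the Mutant cells are included in the selection, which is the situation described above.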

I think Scenario 3 is appropriate when you are aggregating different datasets, but the dataset on which iNMF is computed should itself contain all the expected variation (different conditions, different cell types).

Is this correct according to the LIGER methodology?

Thanks, Francesco

cgao90 commented 3 years ago

Hi Francesco,

It's a good point! Scenario 3 is most useful for projecting small, specialized samples onto a large, comprehensive atlas. In your case, assuming the Mutant samples have a few cell types not present in the WT samples, the matrix projection should still be able to detect those cells as non-aligning. However, because the metagenes are learned on the WT samples only and carry no information about those extra cell types, Scenario 3 may fail to distinguish the non-aligning Mutant cells from one another.
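A toy linear-algebra sketch of this point, in base R. Everything here is an assumption for illustration: the two metagenes, the six genes, and the three cells are invented, and plain least squares stands in for whatever solver LIGER uses in the projection step.

```r
# Hypothetical metagenes (rows) over six genes, imagined as learned on WT only.
# Genes 5 and 6 carry no weight in any metagene.
W <- rbind(
  metagene1 = c(1, 1, 0, 0, 0, 0),
  metagene2 = c(0, 0, 1, 1, 0, 0)
)

# Three hypothetical cells: one WT-like, and two novel types that differ
# from each other only in genes the metagenes do not cover.
X <- rbind(
  wt_like = c(2, 2, 0, 0, 0, 0),
  novelA  = c(0, 0, 0, 0, 3, 0),
  novelB  = c(0, 0, 0, 0, 0, 3)
)

# Project cells onto the fixed metagenes with ordinary least squares
# (a stand-in for the real solver): H minimizes ||X - H W||.
H <- X %*% t(W) %*% solve(W %*% t(W))

# Per-cell reconstruction error: how poorly the metagenes explain each cell.
residual <- sqrt(rowSums((X - H %*% W)^2))

print(H)
print(round(residual, 3))
```

In this sketch the two novel cells are poorly reconstructed, so they stand out from the WT-like cell, but they receive identical factor loadings and therefore cannot be separated from each other in the projected space.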