Feasibility of projecting Mutant samples onto cell space computed on WT samples

Hi,

I have a methodological question about LIGER. Does it make sense to first align two datasets using only their WT samples, and then project the Mutant samples onto this alignment (following Scenario 3 in this tutorial)? One step makes me think this approach will fail to capture signal coming from the Mutant samples: before alignment, the top variable genes must be selected, and here they are selected from the WT data only (see #222 for more context):

wt.ldata <- selectGenes(wt.ldata)
mut.ldata@var.genes <- wt.ldata@var.genes

By design, genes that vary strongly between Mutant and WT might not be very variable within the WT samples. Therefore, they won't be selected for iNMF, nor for the subsequent projection of the Mutant samples onto the same space.
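To illustrate the concern, here is a minimal toy sketch in base R. It does not use LIGER: the data are simulated, the gene names are hypothetical, and `top_variable` is a simplified stand-in that ranks genes by plain variance, which is not the criterion `selectGenes` actually applies.

```r
set.seed(1)
n_cells <- 200

# Hypothetical toy data: rows are genes, columns are cells.
# "typeGene" differs between two cell types in both conditions,
# "mutGene" is flat within WT and shifted only in Mutant cells,
# "noiseGene" is flat everywhere.
cell_type <- rep(c(0, 1), each = n_cells / 2)
wt <- rbind(
  typeGene  = rnorm(n_cells, mean = 5 + 3 * cell_type, sd = 0.5),
  mutGene   = rnorm(n_cells, mean = 5, sd = 0.5),
  noiseGene = rnorm(n_cells, mean = 5, sd = 0.5)
)
mut <- rbind(
  typeGene  = rnorm(n_cells, mean = 5 + 3 * cell_type, sd = 0.5),
  mutGene   = rnorm(n_cells, mean = 9, sd = 0.5),
  noiseGene = rnorm(n_cells, mean = 5, sd = 0.5)
)

# Simplified stand-in for variable-gene selection (an assumption,
# not LIGER's method): rank genes by variance and keep the top n.
top_variable <- function(expr, n = 1) {
  v <- apply(expr, 1, var)
  names(sort(v, decreasing = TRUE))[seq_len(n)]
}

wt_only  <- top_variable(wt, n = 1)              # selection on WT cells only
combined <- top_variable(cbind(wt, mut), n = 1)  # selection on WT + Mutant cells

print(wt_only)
print(combined)
```

In this toy setting, the gene that separates Mutant from WT ranks highest only when the Mutant cells take part in the selection. With WT cells alone it is indistinguishable from noise.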

I think Scenario 3 is appropriate when you are aggregating different datasets, but the dataset on which iNMF is computed should itself contain all of the expected variation (different conditions, different cell types).

Is this correct according to the LIGER methodology?

Thanks, Francesco
