Open jenzopr opened 5 years ago
Hi Jens, glad you have found scHPF useful! I think how you handle batch effects under the current scHPF model will be highly dataset-dependent.
In general, I think what you propose sounds totally reasonable. Keep in mind though that many scRNA-seq experiments in human are completely batch-confounded (i.e., on different genetic backgrounds, on people with different environmental exposures, etc.), so it can be hard to tell a technical "batch" effect from real biology. If you take the throw-away approach, I think it's prudent to check that the factor in question isn't also correlated with a covariate you care about (if possible).
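The filtering idea discussed here, together with the sanity check against a covariate of interest, could be sketched roughly as follows. This is only an illustration under stated assumptions: `cell_scores` is a hypothetical cells-by-factors matrix (simulated here, not loaded through any scHPF call), `plate` and `condition` are made-up covariates, eta-squared is one possible association measure for a categorical covariate, and the 0.5 threshold is arbitrary.

```python
import numpy as np

def eta_squared(scores, labels):
    """Fraction of each factor's variance explained by a categorical label."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    grand_mean = scores.mean(axis=0)
    ss_total = ((scores - grand_mean) ** 2).sum(axis=0)
    ss_between = np.zeros(scores.shape[1])
    for level in np.unique(labels):
        group = scores[labels == level]
        ss_between += group.shape[0] * (group.mean(axis=0) - grand_mean) ** 2
    return ss_between / ss_total

# Simulated stand-in data (assumption): 300 cells, 5 factors.
rng = np.random.default_rng(0)
n_cells, n_factors = 300, 5
plate = rng.integers(0, 3, size=n_cells)       # technical covariate
condition = rng.integers(0, 2, size=n_cells)   # covariate of interest
cell_scores = rng.gamma(2.0, 1.0, size=(n_cells, n_factors))
cell_scores[:, 2] += 5.0 * plate       # factor 2 tracks the plate
cell_scores[:, 4] += 5.0 * condition   # factor 4 tracks the biology

batch_eta = eta_squared(cell_scores, plate)
bio_eta = eta_squared(cell_scores, condition)

# Drop a factor only if it follows the plate and NOT the covariate of interest.
threshold = 0.5  # arbitrary cutoff, tune per dataset
drop = np.where((batch_eta > threshold) & (bio_eta < threshold))[0]
keep = np.setdiff1d(np.arange(n_factors), drop)
filtered_scores = cell_scores[:, keep]  # input for downstream embedding
```

In a fully confounded design, `batch_eta` and `bio_eta` would both be high for the same factor, and the rule above would keep that factor rather than discard it silently.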
Another approach we have taken is to apply scHPF separately across different background conditions, and cluster the resulting factors using the top and bottom genes to find conserved expression modules. See Figure 4, Extended Data Figure 4 and Methods in Szabo, Levitin et al., 2019.
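To give a feel for comparing factors across separate runs, here is a minimal sketch. It is not the procedure from the paper: it uses only the top genes (not the bottom ones), and it replaces clustering with a simpler reciprocal-best-match on Jaccard overlap. The gene score matrices, gene names, `n_top` and `min_sim` are all hypothetical and simulated for illustration.

```python
import numpy as np

def top_gene_sets(gene_scores, gene_names, n_top=20):
    """Return one set of top-scoring gene names per factor (column)."""
    order = np.argsort(-gene_scores, axis=0)[:n_top]  # shape (n_top, n_factors)
    return [set(gene_names[order[:, k]]) for k in range(gene_scores.shape[1])]

def jaccard_matrix(sets_a, sets_b):
    """Pairwise Jaccard similarity between two lists of gene sets."""
    sim = np.zeros((len(sets_a), len(sets_b)))
    for i, a in enumerate(sets_a):
        for j, b in enumerate(sets_b):
            sim[i, j] = len(a & b) / len(a | b)
    return sim

def reciprocal_best_matches(sim, min_sim=0.3):
    """Pairs (i, j) that are each other's best match and similar enough."""
    best_b = sim.argmax(axis=1)
    best_a = sim.argmax(axis=0)
    return [(i, int(j)) for i, j in enumerate(best_b)
            if best_a[j] == i and sim[i, j] >= min_sim]

# Simulated gene scores for two independent runs (assumption): 200 genes, 3 factors each.
rng = np.random.default_rng(1)
gene_names = np.array([f"gene{i}" for i in range(200)])
run_a = rng.gamma(1.0, 0.1, size=(200, 3))
run_b = rng.gamma(1.0, 0.1, size=(200, 3))
run_a[0:20, 0] += 10.0    # module 0
run_a[20:40, 1] += 10.0   # module 1
run_a[40:60, 2] += 10.0   # module present only in run A
run_b[20:40, 0] += 10.0   # module 1 again
run_b[60:80, 1] += 10.0   # module present only in run B
run_b[0:20, 2] += 10.0    # module 0 again

sim = jaccard_matrix(top_gene_sets(run_a, gene_names),
                     top_gene_sets(run_b, gene_names))
conserved = reciprocal_best_matches(sim)
```

Factors that pair up across runs are candidates for conserved modules, while unmatched factors may be specific to one background. With more than two runs, a proper clustering of the pooled factors would be the more natural choice.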
Let me know if you have any other questions. Best, Hanna
Thanks a lot, Hanna, for your helpful elaboration on my proposed approach. Your correlation-based approach sounds very straightforward and might be more suitable for what I'm after. I will give it a try!
Thank you for the great contribution! I'm currently trying scHPF on a few datasets and found that it was easy to apply.
I'd like to know how to properly deal with batch effects (mostly technical covariates), e.g. when the input was processed on different plates. Do you have recommendations? A rather simple solution would be to identify the factors that correlate highly with the covariate of interest and simply leave them out of, e.g., further dimensionality reduction?
Best, Jens