neurorestore / Libra

MIT License

Question about batch effect correction #56

Open 1226235 opened 4 months ago

1226235 commented 4 months ago

Hi,

Thanks for implementing all these methods in this easy-to-use package!

I have a question about how Libra deals with batch effects in single-cell data. The README suggests that "If batch effects are present in the data, these should be accounted for, e.g., using [Seurat] or [Harmony], to avoid biasing differential expression by technical differences or batch effects." As I understand it, Harmony only corrects the PC embeddings, not the gene expression values. In that case, is it still the raw (i.e., uncorrected) data that is passed to the Libra DE analysis? Seurat integration does correct the gene expression values, but I've read that it is not safe to use corrected values for DE analysis, since doing so violates the assumption that the measurements are independent of each other.
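To make the first point concrete, here is a minimal Python sketch with made-up numbers. It is not Harmony's actual algorithm and does not use Libra's API: `embed` and `center_by_batch` are hypothetical stand-ins for a PCA projection and an embedding-level correction. It only illustrates that adjusting the embedding leaves the count matrix untouched, so any DE step reading the counts still sees uncorrected values.

```python
import copy

# Toy raw counts: rows are cells, columns are genes (made-up numbers).
counts = [
    [5, 0, 3],
    [4, 1, 2],
    [9, 2, 7],
    [8, 3, 6],
]
batch = ["a", "a", "b", "b"]

# Hypothetical fixed loadings standing in for a fitted PCA (2 components x 3 genes).
loadings = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, -0.5],
]


def embed(cells, loadings):
    """Project each cell onto the loadings to get a low-dimensional embedding."""
    return [
        [sum(w * g for w, g in zip(component, cell)) for component in loadings]
        for cell in cells
    ]


def center_by_batch(embedding, batch):
    """Subtract the per-batch mean in every dimension.

    A deliberately crude stand-in for an embedding-level correction;
    it returns a new embedding and never touches the counts.
    """
    corrected = [row[:] for row in embedding]
    for label in set(batch):
        idx = [i for i, b in enumerate(batch) if b == label]
        for d in range(len(embedding[0])):
            mean = sum(embedding[i][d] for i in idx) / len(idx)
            for i in idx:
                corrected[i][d] -= mean
    return corrected


counts_before = copy.deepcopy(counts)
embedding = embed(counts, loadings)
corrected = center_by_batch(embedding, batch)

# The embedding changed, but the expression matrix did not.
print("embedding changed:", corrected != embedding)
print("counts unchanged:", counts == counts_before)
```

In this toy case the batch shift disappears from the embedding (useful for clustering and annotation), while the expression values available for DE are exactly the ones I started with.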

Can you please share some thoughts about this? Thank you very much!

AlanTeoYueYang commented 3 months ago

Hi, you are right. Libra does not itself correct for any batch effects in the data. Batch effects influence a Libra analysis through the cell type annotation, and that step should be corrected for using Seurat/Harmony. We will update the README accordingly. Thank you for bringing this to our attention.

DarioS commented 3 months ago

If the batch-corrected data are stored in a Seurat or SingleCellExperiment object, how will run_de know to use them instead of the raw counts? There is no assay name parameter that lets the user specify which data slot to use.