immunogenomics / harmony

Fast, sensitive and accurate integration of single-cell data with Harmony
https://portals.broadinstitute.org/harmony/

about batch correction #127

Open tanasa opened 3 years ago

tanasa commented 3 years ago

Dear all,

I would appreciate your opinions, comments, and advice on the following:

<> We have two batches of scRNA-seq data in the CTRL condition (batch 1 and batch 2) and two batches in the STIM condition (batch 1 and batch 2). The batches were generated 1-2 months apart.

<> We had to combine the two CTRL batches, and likewise the two STIM batches, to have enough cells in each cluster to call conserved and differential markers.

The question is: when we do batch correction with Harmony, or with other algorithms, would these algorithms also correct for the fact that the two CTRL batches and the two STIM batches were produced at different time points and then merged into one CTRL matrix and one STIM matrix?

Thanks a lot,

-- bogdan

hahia commented 3 years ago

This is also the question I want to ask. I hope someone can offer some suggestions!

cswoboda commented 3 years ago

Hi both, I'm not the package maintainer, but this shouldn't be an issue for this type of integration. You start with the raw count matrices before integration. I find it quite likely that if you run Harmony on just the four count matrices, grouping by dataset alone, you should be okay. However, you can opt for a multivariate integration, where you control for two variables at once. Here's an example for your needs:

dataset <- RunHarmony(dataset, group.by.vars = c("sample", "condition"))

This would control for both variables in a single run (sample = the individual batches, condition = CTRL vs. STIM).

Truthfully, you're just feeding in the raw count matrices, so it shouldn't matter too much.
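
As a minimal sketch of the metadata that the call above relies on, the base R snippet below builds one row per cell with `sample`, `condition`, and `batch` columns and checks that every condition was profiled in every batch. The cell counts and the `CTRL_batch1`-style labels are hypothetical placeholders; `sample` and `condition` simply mirror the `group.by.vars` above, and the `batch` column is an added assumption.

```r
# Hypothetical number of cells per sample; replace with the real counts.
cells_per_sample <- c(CTRL_batch1 = 3, CTRL_batch2 = 2,
                      STIM_batch1 = 4, STIM_batch2 = 3)

# One row per cell: each sample label is repeated once per cell.
meta <- data.frame(
  sample = rep(names(cells_per_sample), times = cells_per_sample),
  stringsAsFactors = FALSE
)

# Split labels such as "CTRL_batch1" into condition and batch.
meta$condition <- sub("_.*$", "", meta$sample)
meta$batch     <- sub("^.*_", "", meta$sample)

# Cross-tabulate condition against batch.
design <- table(meta$condition, meta$batch)
print(design)

# TRUE when no condition/batch combination is empty, i.e. the two
# variables are crossed rather than confounded.
is_crossed <- all(design > 0)
```

In a Seurat workflow, columns like these would presumably be added to the object's metadata before running Harmony. If `is_crossed` were FALSE (for example, all CTRL cells in one batch and all STIM cells in the other), batch and condition could not be separated by any correction method.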

tanasa commented 3 years ago

Thanks a lot for your time, comments, and suggestions :)