theislab / kBET

An R package to test for batch effects in high-dimensional single-cell RNA sequencing data.
Apache License 2.0

kBET documentation for batch argument #63

Closed by edward130603 3 years ago

edward130603 commented 3 years ago

Hello, I was wondering if you could clarify the documentation for the batch argument to the kBET function. Currently, it says "batch id for each cell or a data frame with both condition and replicates". Does the data frame option allow you to specify multiple replicates for some biological condition (e.g. treatment vs. control)? If so, could you provide an example of how to format this data frame?

mbuttner commented 3 years ago

Hi @edward130603, thank you for your comment. The batch argument is currently a vector with one entry per cell (its length equals the number of cells), and it is treated as a categorical variable (i.e. as a factor). Each level is treated equally, so no nested batch design is considered. If you want to test for batch effects in a nested study design, I suggest splitting the data by condition (e.g. using only the treatment samples) and then testing each condition separately. Otherwise, if you use all samples from the different conditions together, the batch effect will be confounded with the biological signal (the differences between conditions). I hope that helps!
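A minimal sketch of the split-by-condition approach described above, using base R only. The objects `expr` and `meta` and the column names `condition` and `replicate` are hypothetical and not part of kBET. The toy data only illustrate the shapes involved: a cells-by-genes matrix and one batch label per cell.

```r
# Toy expression matrix: 40 cells (rows) x 5 genes (columns).
set.seed(1)
expr <- matrix(rnorm(40 * 5), nrow = 40,
               dimnames = list(paste0("cell", 1:40), paste0("gene", 1:5)))

# Hypothetical metadata: two conditions, each with two replicates (batches).
meta <- data.frame(
  condition = rep(c("treatment", "control"), each = 20),
  replicate = rep(rep(c("rep1", "rep2"), each = 10), times = 2),
  row.names = rownames(expr)
)

# Split the cell indices by condition so each test sees one condition only.
cells_by_condition <- split(seq_len(nrow(expr)), meta$condition)

# For each condition, keep the matching rows of the matrix and build a
# per-cell batch factor from the replicate labels.
inputs <- lapply(cells_by_condition, function(idx) {
  list(df = expr[idx, , drop = FALSE],
       batch = factor(meta$replicate[idx]))
})

# With kBET installed, each subset could then be tested on its own, e.g.:
# res_treatment <- kBET::kBET(inputs$treatment$df, inputs$treatment$batch,
#                             plot = FALSE)
```

Each element of `inputs` then holds a single condition, so the replicate labels are the only grouping the test sees and the condition difference cannot be mistaken for a batch effect. A real dataset would need far more cells and genes than this toy example for the test to be meaningful.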

edward130603 commented 3 years ago

Very helpful. Thanks for clarifying!