Hi! Thanks for the questions and comment!
Yes, you are absolutely right, DIALOGUE's first step does not include batch correction, and therefore if there are substantial batch effects in the data it is recommended to provide the embedding (e.g., PCs) post-correction. In other words, the `X` input for each `cell.type` will be your batch-corrected embedding.
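For example, something along these lines (a rough sketch only; the `make.cell.type` call follows the package README, and the matrix/metadata object names are placeholders for your own data):

```r
library(DIALOGUE)

# Hypothetical inputs for one cell type (here T cells), all over the same set of cells:
#   tpm.tcells  - genes x cells expression matrix
#   emb.tcells  - cells x k batch-corrected embedding (e.g., Harmony PCs or an scVI latent space)
#   meta.tcells - per-cell data.frame that includes the sample and batch labels
r.tcells <- make.cell.type(name = "Tcells",
                           tpm = tpm.tcells,
                           samples = meta.tcells$sample,
                           X = emb.tcells,          # batch-corrected embedding instead of uncorrected PCs
                           metadata = meta.tcells,
                           cellQ = meta.tcells$nFeature_RNA)  # per-cell quality metric (e.g., no. of genes detected)
```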
As you mentioned, DIALOGUE's second step is based on mixed-effect models and can account for batch effects if the batch is provided as a confounder. Alternatively, you can provide batch-corrected gene expression instead of including the batch as a confounder.
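Running DIALOGUE with the batch as a confounder would then look roughly like this (again a sketch based on the README's toy example; `r.myeloid` stands for a second cell.type object built the same way, and `"batch"` is assumed to be a column in each cell type's metadata):

```r
# Include the batch as a confounder in the mixed-effects step
R <- DIALOGUE.run(rA = list(r.tcells, r.myeloid),   # list of cell.type objects
                  main = "batch.corrected.example",
                  k = 5,
                  results.dir = "DIALOGUE.results/",
                  conf = "batch")                    # batch enters the mixed-effect models as a confounder
```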
As a general note, I assume the batches are not identical to samples in your data, but if that's the case then it would be difficult to distinguish between batch-effects and cross-sample biological variation, irrespective of which method you use.
Hope this helps and let us know if you have any additional questions.
Hey,
Thanks for such an informative reply. I follow what you mean completely. I will try submitting a batch-corrected `X` and including my batch in `conf`, and will let you know if anything tricky pops up. Thanks again!
Hi authors,
Loved the manuscript and concept. Can really see the potential for DIALOGUE to analyse big atlases of many hundreds of samples.
With this in mind, have you discussed / tested DIALOGUE for use on batch-corrected data? I am specifically interested in whether you think a batch-corrected latent representation (i.e., as output by scVI) could serve as the input for the original feature space, alongside uncorrected UMI counts for the `tpm` input. Alternatively, is normalised expression suitable here? Of course, there is also the Seurat output (normalised UMI counts). Would the hierarchical models mentioned in para 1 of the methods be able to account for these specific contexts? Any ideas would be fantastic, thanks a lot.