Open Thapeachydude opened 9 months ago
Seems pretty reasonable to me. The sparsity wouldn't even matter in `fastMNN`, which typically operates on the PC space anyway. The only thing to keep in mind is that bulk datasets generally have fewer samples, so the default choices of `k` (the number of neighbors used to find MNNs) may not be appropriate.
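To make the point about `k` concrete, here is a toy sketch in plain Python of how mutual nearest neighbors are found between two small batches. It is not the `fastMNN` implementation; the helper names (`k_nearest`, `mnn_pairs`) and the coordinates are made up for illustration, and the 2-D points stand in for samples in PC space.

```python
import math

def k_nearest(query, targets, k):
    """Indices of the (at most) k points in `targets` closest to `query`."""
    order = sorted(range(len(targets)), key=lambda j: math.dist(query, targets[j]))
    return set(order[:k])

def mnn_pairs(batch_a, batch_b, k):
    """Pairs (i, j) where a[i] and b[j] are in each other's k nearest neighbors."""
    a_to_b = [k_nearest(p, batch_b, k) for p in batch_a]
    b_to_a = [k_nearest(q, batch_a, k) for q in batch_b]
    return [(i, j)
            for i in range(len(batch_a))
            for j in a_to_b[i]
            if i in b_to_a[j]]

# Hypothetical "bulk" batches: 4 samples each, two conditions per batch,
# with batch B shifted relative to batch A.
batch_a = [(0.0, 0.0), (0.3, 0.0), (5.0, 5.0), (5.1, 4.9)]
batch_b = [(1.0, 1.0), (1.1, 0.9), (6.0, 6.0), (6.3, 6.0)]

small_k = mnn_pairs(batch_a, batch_b, k=1)   # k well below the batch size
large_k = mnn_pairs(batch_a, batch_b, k=20)  # k larger than the batch size

print("k=1 :", sorted(small_k))
print("k=20:", len(large_k), "pairs out of", len(batch_a) * len(batch_b))
```

Once `k` is at least as large as the number of samples in a batch, every cross-batch pair counts as "mutual", including pairs from different conditions, so the MNN step no longer filters anything. With only a handful of bulk samples per batch, you would probably want `k` well below the size of the smallest batch.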
I suppose the other reason that we don't use this class of batch correction methods for bulk data is that the corrected output is not fit for DE analyses. (In fact, you could say that about any batch correction method.) So it's fine for exploratory analysis, a bit of clustering, visualization, etc., but if you plan on doing some DE, you'd want to go back to the raw counts.
Hi, thanks a lot for the quick reply and the feedback!
Hi,
Great package. I was wondering if this form of batch integration is also applicable to bulk RNA-Seq data. Sure, the data is less sparse, but would that be an issue?
Happy about any feedback! Best, M