morris-lab / Capybara

Capybara: A computational tool to measure cell identity and fate transitions

Batch effect #2

Closed by petrsh 4 years ago

petrsh commented 4 years ago

Hi, I was wondering whether a batch effect can be an issue when constructing a reference from datasets that don't share any cell types. In that case, to the best of my knowledge, it's impossible to distinguish biological from technical variation. When classifying cells from different datasets, a batch effect shouldn't be an issue, right? But when constructing a reference I'm not sure, and I'd like to know your opinion. Thanks a lot! Petr

KaetheKong commented 4 years ago

Hi Petr,

Batch effects could potentially be an issue in the classification process. Because reference construction handles each cell type individually, the construction itself should not be affected when the two datasets don't share any cell types; in a sense, the batch effect is simply carried over into the reference. In the pipeline, we then apply a naive way of relieving this issue through normalization, which is one reason we recommend using raw data. With this preprocessing, the batch effect should hopefully not affect the downstream classification. We also built such a mixed reference in the last part of the manuscript, on iEP reprogramming, where we combined an inDrop dataset with the Mouse Cell Atlas. We believe the results made sense, and the negative control experiments we performed classified to the correct cell type within the mixed reference.

Hope this is helpful! Best Wenjun
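
For readers who want a concrete picture of why starting from raw counts matters, here is a minimal Python sketch of one common normalization: scaling each cell to a fixed library size and applying a log transform. This is an assumption made for illustration and is not taken from Capybara's code (Capybara itself is an R package, and the thread does not say which normalization it applies). The function name `log_normalize`, the scale factor of 10,000, and the simulated datasets are all hypothetical.

```python
import numpy as np


def log_normalize(counts, scale=10_000):
    """Scale each cell (row) to `scale` total counts, then apply log1p.

    Hypothetical helper for illustration only; not Capybara's implementation.
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=1, keepdims=True)
    lib_size[lib_size == 0] = 1.0  # leave all-zero cells at zero instead of dividing by 0
    return np.log1p(counts / lib_size * scale)


# Simulate two "batches" that share one expression profile over 50 genes
# but differ tenfold in sequencing depth (a purely technical difference).
rng = np.random.default_rng(0)
profile = rng.dirichlet(np.ones(50))
shallow = rng.multinomial(2_000, profile, size=20)   # 20 cells, 2,000 counts each
deep = rng.multinomial(20_000, profile, size=20)     # 20 cells, 20,000 counts each

# Gap between the batches' mean expression, before and after normalization.
raw_gap = abs(shallow.mean() - deep.mean())
norm_gap = abs(log_normalize(shallow).mean() - log_normalize(deep).mean())

print(f"raw gap: {raw_gap:.1f}, normalized gap: {norm_gap:.3f}")
```

A step like this only removes differences in sequencing depth. Platform-specific biases, such as those between inDrop and other protocols, can remain, which is consistent with the reply describing normalization as a naive remedy.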