ebi-gene-expression-group / scanpy-scripts

Scripts for using scanpy
Apache License 2.0

Batch aware Scrublet #105

Closed pinin4fjords closed 3 years ago

pinin4fjords commented 3 years ago

The Scrublet documentation states that we shouldn't be doing a single run on multi-batch data. So this PR adds code to run each batch separately and combine the results. Filtering can then be done on the merged doublet calls.

The batch-wise slices have to be copied rather than just being views, but I've tried to minimise the impact on memory by only saving the stats we need from each one.
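For illustration only, here is a minimal sketch of the split-and-merge pattern described above. It is not the code in this PR. The names `doublet_calls_by_batch`, `score_batch` and `toy_scorer` are hypothetical, and the per-batch scorer is passed in as a plain callable, so nothing here depends on the Scrublet or Scanpy APIs. In real use that callable would presumably copy the batch slice, run Scrublet on it, and return only the per-cell score and call.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Optional, Sequence, Tuple

# Per-cell result kept from each batch: (doublet_score, predicted_doublet).
Call = Tuple[float, bool]


def doublet_calls_by_batch(
    batches: Sequence[Hashable],
    score_batch: Callable[[List[int]], List[Call]],
) -> List[Call]:
    """Score each batch separately and merge the calls in original cell order.

    `batches` holds one batch label per cell. `score_batch` (hypothetical)
    receives the cell positions of one batch and returns one call per cell.
    """
    # Group cell positions by batch label.
    positions = defaultdict(list)
    for i, label in enumerate(batches):
        positions[label].append(i)

    merged: List[Optional[Call]] = [None] * len(batches)
    for label, idx in positions.items():
        calls = score_batch(idx)
        if len(calls) != len(idx):
            raise ValueError(f"batch {label!r}: expected {len(idx)} calls, got {len(calls)}")
        # Keep only the per-cell stats; the batch-level object can be discarded.
        for i, call in zip(idx, calls):
            merged[i] = call
    return merged


def toy_scorer(idx: List[int]) -> List[Call]:
    """Stand-in for a real per-batch doublet detector (assumption, for the demo)."""
    score = float(len(idx))
    return [(score, len(idx) > 2) for _ in idx]


if __name__ == "__main__":
    labels = ["a", "b", "a", "b", "a"]
    for label, call in zip(labels, doublet_calls_by_batch(labels, toy_scorer)):
        print(label, call)
```

Writing the results back by original position means the merged calls line up with the full dataset regardless of how the batches are interleaved, so filtering can be applied once at the end.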

This should probably be pushed to the Scanpy codebase where the Scrublet wrapper lives, and I'll do that, but we need this fairly quickly and I don't want to wait for a Scanpy release.