broadinstitute / CellBender

CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
https://cellbender.rtfd.io
BSD 3-Clause "New" or "Revised" License
285 stars, 52 forks

Coupling CellBender w/ other preprocessing tools--doublet detection, demultiplexing #381

Open lhoranportelance opened 3 weeks ago

lhoranportelance commented 3 weeks ago

Hello! Thank you for creating this great tool. I have gotten CellBender to run successfully on my data. Now I'm wondering what your recommended pipeline would look like for integrating its outputs with those from other preprocessing tools. Basically, I'm not sure what order of operations you recommend. I am running CellBender alongside Demuxafy, which wraps several doublet detection and demultiplexing tools. I see a couple of possible ways to combine them. Which (if any) would you recommend?

1) Run Demuxafy on the filtered CellRanger output to get the doublet barcodes. Separately, run CellBender on the raw CellRanger output to get the adjusted count matrix, then filter that matrix down to only the non-doublet barcodes identified by Demuxafy. (This is what I am leaning towards.)
2) Run CellBender on the raw CellRanger output, then use the filtered CellBender .h5 as input to Demuxafy for doublet detection and demultiplexing. (My worries with this approach are that (a) certain tools may not accept the CellBender .h5 given the changes in v0.3, and (b) using the adjusted count matrix may interfere with demultiplexing and doublet detection.)
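For concreteness, the final filtering step of option 1 could be sketched roughly as below. This is only an illustration on toy in-memory data, not a recommended pipeline: the column names `Barcode` and `DropletType`, the `singlet` label, and the helper names `singlet_barcodes` and `filter_counts` are all assumptions made up for the example, and real data would be loaded from the CellBender .h5 and the Demuxafy results with whatever readers you normally use.

```python
import csv
import io


def singlet_barcodes(assignments_tsv, barcode_col="Barcode", type_col="DropletType"):
    """Return the set of barcodes labeled 'singlet' in a tab-separated table.

    The column names and the 'singlet' label are assumptions for this sketch;
    adjust them to match the actual doublet-detection output.
    """
    reader = csv.DictReader(assignments_tsv, delimiter="\t")
    return {row[barcode_col] for row in reader if row[type_col] == "singlet"}


def filter_counts(barcodes, count_rows, keep):
    """Keep only the rows whose barcode is in `keep`, preserving the original order."""
    kept = [(bc, row) for bc, row in zip(barcodes, count_rows) if bc in keep]
    return [bc for bc, _ in kept], [row for _, row in kept]


# Toy stand-in for a doublet-detection results table (hypothetical layout).
demo_tsv = io.StringIO(
    "Barcode\tDropletType\n"
    "AAAC-1\tsinglet\n"
    "AAAG-1\tdoublet\n"
    "AAAT-1\tsinglet\n"
)

# Toy stand-in for the background-corrected matrix: one row of counts per barcode.
cb_barcodes = ["AAAC-1", "AAAG-1", "AAAT-1", "AACC-1"]
cb_counts = [[3, 0, 1], [5, 2, 2], [0, 1, 4], [1, 1, 0]]

keep = singlet_barcodes(demo_tsv)
filtered_barcodes, filtered_counts = filter_counts(cb_barcodes, cb_counts, keep)
print(filtered_barcodes)
```

Note that `AACC-1` is dropped here even though it was never flagged as a doublet: it simply has no entry in the doublet table. Something like this can happen in option 1 whenever the two tools start from different sets of cell barcodes, so it is worth checking how many barcodes fall into that category before discarding them.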

If there's an approach you already use, please let me know--otherwise, any advice is welcome!

onurcanbektas commented 2 weeks ago

I'm also curious about this. Intuitively, I would expect that running Demuxafy on CellBender's output makes more sense. After all, a droplet can be classified as a doublet either because (a) it contains a lot of background (a false positive) or (b) it is a true doublet. Running Demuxafy on the background-removed data should reduce the first kind of call and give more statistical power, I think.