broadinstitute / CellBender

CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
https://cellbender.rtfd.io
BSD 3-Clause "New" or "Revised" License

Unsure how to interpret cell probabilities #229

Open egatlas opened 1 year ago

egatlas commented 1 year ago

I have just started using CellBender and came upon a result I don't know how to interpret. Apologies if someone has already posted a similar-looking plot. My training looks okay as far as I can tell, but I don't see any barcodes with a cell probability of 0. I am currently running it again with 50,000 droplets included, but I'm not sure why I am seeing this result. Thanks in advance for your help!

[Image: Screenshot 2023-07-15 at 10 58 55 AM]

egatlas commented 1 year ago

Here's what the output looks like with 50,000 barcodes. Does this look good or are there ways I should adjust the training? Thanks!

[Image: Screenshot 2023-07-15 at 12 08 29 PM]

sjfleming commented 1 year ago

Hi @egatlas, sorry for the delay. It looks like you have a dataset here with a whole lot of ambient RNA: the empty droplets seem to have around 2000 UMI counts each. Do you agree?

It might be the case that CellBender is just a little bit uneasy assigning a cell probability of 0 to a droplet with 2000 UMI counts. But the run results are probably totally fine.

One question though: in the log file (first 20 or so lines), where it says "counts in empty droplets" and gives a number... what is that number? I want to make sure CellBender also thinks the empty droplets have about 1000 or 2000 UMI counts. If so, the result seems fine. For the purposes of constructing the output, any droplet with a cell probability < 0.5 is considered "empty" by CellBender. So it doesn't matter whether the probability goes all the way to zero: as long as it's < 0.5, the droplet will be treated as empty when the output is computed.
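To make the 0.5 cutoff concrete, here is a minimal sketch in plain Python. It is not CellBender's own code: the function name `label_droplets` and the example probabilities are made up for illustration, and treating a probability of exactly 0.5 as a cell is an assumption based on the "< 0.5 is empty" wording above.

```python
def label_droplets(cell_probabilities, threshold=0.5):
    """Label each droplet 'empty' or 'cell' from its cell probability.

    Hypothetical helper for illustration only. A droplet is 'empty' when
    its probability is strictly below the threshold, so a value of 0.3 is
    handled exactly like a value of 0.0.
    """
    return ["empty" if p < threshold else "cell" for p in cell_probabilities]


# Made-up probabilities: none of the low ones reach 0, yet they are still empty.
example_probs = [0.99, 0.62, 0.31, 0.08]
print(label_droplets(example_probs))
```

The point is that only the side of the threshold matters, so a plateau of low but nonzero probabilities does not change which droplets end up called empty.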

Is it possible to see the full log-log UMI curve for this sample?
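For anyone unsure what that curve is: it is the total UMI count per barcode, sorted from highest to lowest and plotted against rank with both axes on a log scale. A rough sketch of how the points could be computed is below; the function name `umi_curve` and the input list are hypothetical, and getting per-barcode totals out of your own count matrix is left to whatever tool you already use.

```python
def umi_curve(umi_counts_per_barcode):
    """Return (rank, count) pairs for a rank-ordered UMI curve.

    Hypothetical helper for illustration only. Barcodes with zero counts
    are dropped because they cannot be shown on a log axis.
    """
    ranked = sorted((c for c in umi_counts_per_barcode if c > 0), reverse=True)
    return list(enumerate(ranked, start=1))


# Made-up totals for four barcodes.
points = umi_curve([5, 0, 2000, 30])
print(points)
```

Plotting these pairs with log-scaled x and y axes gives the usual picture: a high shoulder of cells on the left, then a plateau of empty droplets whose height reflects the ambient RNA level.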