Open aodainic7 opened 1 year ago
Hi @aodainic7 , I think I have a suggestion that can help you!
So right now, cellbender is identifying what I believe to be cells AND empty droplets as "cells". I think those regions on the UMI curve with ~300 counts (like batch3_2 from droplet 30k to droplet 100k) are the empty droplets. So you have several hundred counts of ambient RNA in empty droplets, and cellbender can probably help out a lot!
But currently cellbender is not identifying the empty droplets correctly. This can probably be fixed by changing two things:
1. Make --total-droplets-included smaller. It should be pretty much the first droplet where you're 100% sure everything past that is empty.
2. Increase the --low-count-threshold parameter. This will help cellbender more easily identify the empty droplets. In your case, I would set the parameter to 100, telling cellbender that any droplet with < 100 UMI counts is "past the empty droplet plateau" and should be ignored completely. Those droplets probably represent cell barcode sequencing errors, and they are not the "real" empty droplets.

So try this:
cellbender remove-background \
--input CellRanger/C120_batch3_5/outs/multi/count/raw_feature_bc_matrix.h5 \
--output /CellBender/mytest/batch3_5_cellbender_out_v2.h5 \
--cuda \
--expected-cells 20000 \
--total-droplets-included 35000 \
--fpr 0.01 \
--low-count-threshold 100
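To make the reasoning behind those two parameters concrete, here is a minimal Python sketch that ranks droplets by UMI count and counts how many sit at or above a low-count threshold. The function name `summarize_umi_curve` and the toy counts are made up for illustration and are not part of CellBender. It only helps with eyeballing the curve; the actual choice of --total-droplets-included still comes from looking at the UMI plot.

```python
import numpy as np

def summarize_umi_curve(umi_counts, low_count_threshold=100):
    """Sort droplets by UMI count (descending) and count those at or above the threshold.

    Hypothetical helper for illustration only; not a CellBender function.
    """
    counts = np.sort(np.asarray(umi_counts))[::-1]  # rank 1 = droplet with most UMIs
    n_above = int(np.sum(counts >= low_count_threshold))
    return counts, n_above

# Toy data (invented): 5 cells, 10 empty droplets on a ~300-count plateau,
# and 20 barcodes far below the threshold.
toy_counts = [5000] * 5 + [300] * 10 + [20] * 20
ranked, n_above = summarize_umi_curve(toy_counts, low_count_threshold=100)

print("droplets at or above threshold:", n_above)
print("largest droplet:", ranked[0])
```

In this toy curve the cells end at rank 5 and the empty-droplet plateau runs to rank 15, so a value of --total-droplets-included somewhere inside that plateau would match the advice above.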
Hey Stephen, thanks for the input. I have increased the threshold and I got some decent correction. The results look very promising. I subsetted the T cells and compared the expression of the most changed genes, and to my surprise I found the contamination genes:
Same goes for ADT: the B cell markers get reduced on T cells, but not the T cell markers (which is amazing):
I also see a reduction in HTO, and my question is should I exclude these from the correction? What is your experience?
Here is the mean change in counts per cell, computed as (1 - CellBender-filtered counts / Cell Ranger counts) * 100, per assay:
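For anyone who wants to reproduce that number, a minimal sketch of the calculation follows. It assumes the percentage is computed per cell from total counts and then averaged, which is one reading of the description above; the function name and the toy matrices are invented for illustration.

```python
import numpy as np

def mean_percent_reduction(raw_counts, cleaned_counts):
    """Mean over cells of (1 - cleaned / raw) * 100, using per-cell total counts.

    Assumes rows are cells and columns are features of a single assay.
    """
    raw_per_cell = np.asarray(raw_counts, dtype=float).sum(axis=1)
    cleaned_per_cell = np.asarray(cleaned_counts, dtype=float).sum(axis=1)
    keep = raw_per_cell > 0  # skip cells with no raw counts to avoid dividing by zero
    pct = (1.0 - cleaned_per_cell[keep] / raw_per_cell[keep]) * 100.0
    return float(pct.mean())

# Toy matrices (invented): 2 cells x 2 features.
raw = [[10, 10], [40, 10]]       # Cell Ranger counts
cleaned = [[8, 7], [40, 5]]      # counts after background removal

reduction = mean_percent_reduction(raw, cleaned)
print(round(reduction, 2))  # per-cell reductions are 25% and 10%
```

Running this once per assay (gene expression, ADT, HTO) gives one number per assay, comparable to the summary described above.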
thanks in advance, Cheers Alex
Hi @aodainic7 , are those HTOs that you mention "hashtag oligos" like this kind of thing?
If this is what you're talking about, I'd be interested to hear more about your thoughts on this. I have not used these myself, and unfortunately I don't have any experience. The idea is to be able to pool cells across donors by having an (antibody-labeled) oligo barcode whose barcode encodes donor identity, right? And then you load cells from multiple donors into the same "sample", right?
If the HTOs are subject to the same sort of noise mechanisms as the antibody features (and I would expect this to be the case), then maybe running CellBender on those HTO features does make sense.
What I'd do if it were me would be to compare the raw HTO counts and the CellBender HTO counts. And specifically I'd be really interested to see if the conclusions you draw about demultiplexing cells back to their specific donors end up being the same or different when CellBender is used. For example, is it easier for the demultiplexing algorithm to do its job after CellBender cleanup? Does CellBender go too far? Not make a big difference?
I would think it might be kind of like the human and mouse cell benchmark we use: you might see that donor assignment for singlet cells becomes more obvious, but you'd hope to see that true doublets remain doublets in terms of HTO counts after cellbender.
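A rough way to run that comparison is sketched below in Python. The caller here is a deliberately naive stand-in (top HTO wins only if it clearly beats the runner-up), not a real demultiplexing algorithm, and the function names, the `min_ratio` cutoff, and the toy counts are all assumptions for illustration. In practice you would swap in whatever demultiplexer you already use and compare its calls on raw versus CellBender counts.

```python
import numpy as np

def naive_hto_call(hto_counts, min_ratio=2.0):
    """Assign each cell to its top HTO, or -1 if the runner-up is too close.

    Naive illustrative rule, not a real demultiplexer.
    """
    counts = np.asarray(hto_counts, dtype=float)
    order = np.argsort(counts, axis=1)            # ascending per cell
    top = order[:, -1]
    top_val = np.take_along_axis(counts, order[:, -1:], axis=1)[:, 0]
    second_val = np.take_along_axis(counts, order[:, -2:-1], axis=1)[:, 0]
    confident = top_val >= min_ratio * np.maximum(second_val, 1.0)
    return np.where(confident, top, -1)

def call_agreement(raw_hto, cleaned_hto):
    """Fraction of cells whose call is identical before and after cleanup."""
    raw_calls = naive_hto_call(raw_hto)
    cleaned_calls = naive_hto_call(cleaned_hto)
    return float(np.mean(raw_calls == cleaned_calls)), raw_calls, cleaned_calls

# Toy counts (invented): 3 cells x 3 HTOs.
raw_hto = [[100, 60, 5], [10, 200, 8], [90, 95, 3]]
cleaned_hto = [[95, 10, 0], [2, 190, 0], [85, 90, 0]]

agreement, raw_calls, cleaned_calls = call_agreement(raw_hto, cleaned_hto)
print(raw_calls.tolist(), cleaned_calls.tolist(), agreement)
```

In the toy data the first cell goes from ambiguous to a clear singlet after cleanup, while the third cell, which carries two HTOs at similar levels, stays ambiguous in both cases. That is the pattern one would hope for if the cleanup helps without erasing true doublets.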
Okay actually, I had another thought that complicates this, although I'll leave what I've written above:
Hello Stephen, exactly the same as in the publication, hashtag oligos for multiplexing. I wanted to investigate the questions you asked. I could not see a very strong effect in smaller cell groups, but rather in larger ones. The counts get "decontaminated" for one specific HTO, while the rest remain basically unchanged: Interestingly, the changes stay more or less consistent across cell types in the same sample (which is amazing!). Here is an example of one donor: The results look promising. Do you have any other critical points I should check?
I have a suggestion: someone might want to exclude the HTOs from the background removal, so maybe introduce an option to specify this when running cellbender. Currently there is only --exclude-antibody-capture, so maybe add --exclude-hashtag-oligos.
Cheers!
Hi @aodainic7 , nothing else comes to mind, I don't think. I do think that excluding the HTOs might make more sense in your case. In v0.3.0 I will be changing --exclude-antibody-capture to --exclude-feature-types, where the user can specify any valid feature type. (Currently it has to be one of the types allowed by 10x, which is ['Gene Expression', 'Antibody Capture', 'CRISPR Guide Capture', 'Custom', 'Peaks'].) When you create this dataset, do you run it through 10x CellRanger to get a count matrix? Does the feature_type show up as Custom?
That --exclude-feature-types input argument is now part of the v0.3.0 release.
Hey everyone, I am testing your tool to check for contamination on my scRNAseq+CITEseq experiment. I have one issue and some questions:
Here is the output when I do not omit the ADT, and the pipeline works: batch3_5_cellbender_out_v2.pdf
batch3_1_cellbender_out_v2.pdf batch3_2_cellbender_out_v2.pdf batch3_3_cellbender_out_v2.pdf batch3_4_cellbender_out_v2.pdf
Cheers!