dawe closed this issue 6 months ago
Just look at the unfiltered results to match oligonucleotides to sample names; that's the easiest way to go. kb count outputs unfiltered results (e.g. your unfiltered results will contain a cells_x_genes.barcodes.txt file listing the unfiltered oligonucleotides, which you can match to your sample names).
Excellent, thanks
Hello @Yenaled, sorry for reopening the issue. I have checked and
$ wc -l samples
28 samples
$ wc -l $OUTDIR/counts_unfiltered/cells_x_genes.barcodes.txt
21 RUN799_783/counts_unfiltered/cells_x_genes.barcodes.txt
so 7 cells are missing. I ran kb count with the threshold set to 0 and found a whitelist.txt file, which also doesn't match
$ wc -l $OUTDIR/whitelist.txt
29 RUN799_783/whitelist.txt
with an extra cell. The matrix.cells file contains 28 cells (with the sample names from the original samples file)
$ wc -l $OUTDIR/matrix.cells
28 RUN799_783/matrix.cells
The file matrix.sample.barcodes should be the appropriate one.
Nevertheless, how can I disable bustools filtering?
^ah yes, that is correct.
bustools doesn't do any filtering. What might be happening is that some of your "cells" have 0 mapped reads, in which case they can't appear in the final matrix.
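For anyone wanting to see which samples dropped out, here is a minimal Python sketch (not part of kb or bustools). It rests on one assumption that should be checked against your own output: line i of matrix.sample.barcodes is the barcode assigned to the sample named on line i of matrix.cells. The helper names read_lines and missing_samples are made up for illustration.

```python
from pathlib import Path


def read_lines(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    return [ln.strip() for ln in Path(path).read_text().splitlines() if ln.strip()]


def missing_samples(outdir):
    """List sample names whose barcode is absent from the unfiltered matrix.

    Assumption: matrix.cells and matrix.sample.barcodes are aligned line by line.
    """
    outdir = Path(outdir)
    names = read_lines(outdir / "matrix.cells")
    barcodes = read_lines(outdir / "matrix.sample.barcodes")
    if len(names) != len(barcodes):
        raise ValueError("matrix.cells and matrix.sample.barcodes differ in length")

    # Barcodes that made it into the unfiltered count matrix.
    present = set(read_lines(outdir / "counts_unfiltered" / "cells_x_genes.barcodes.txt"))

    # Keep the original sample order; report samples with no matching barcode.
    return [name for bc, name in zip(barcodes, names) if bc not in present]
```

Calling missing_samples("RUN799_783") on the output directory from this thread would then be expected to return the samples that have no entry in the unfiltered matrix, provided the alignment assumption holds.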
I've solved it by specifying a fixed barcode whitelist like
AAAAAAAAAAAAAAAT
AAAAAAAAAAAAACCT
AAAAAAAAAAAAAAAA
AAAAAAAAAAAAACGA
AAAAAAAAAAAAACGG
AAAAAAAAAAAAAAAC
AAAAAAAAAAAAAAAT
AAAAAAAAAAAAAAGA
AAAAAAAAAAAAAAGA
AAAAAAAAAAAAACAT
…
That is generated by kb itself
Hello, I am processing single cells obtained by SMARTSEQ. I have created a 3-column TSV file with the cell ID and the two read pairs (samples). I launched kb count and got the h5ad file, which contains fewer cells than expected because filtering has been applied (next time I will apply a 0 threshold). However, my issue is that the cell names in the AnnData are oligonucleotides, whereas the first column in the samples file contains sample names. I assume the cell order in the AnnData is the same as specified in the samples file; however, it's not clear how to identify sample correspondences and which cells have been filtered.
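A possible way to put sample names back onto the cells, sketched in Python. It assumes that the matrix.sample.barcodes and matrix.cells files in the kb output directory are aligned line by line (barcode on line i belongs to the sample on line i); that pairing is an assumption to verify on your own data. The function names barcode_map and relabel are hypothetical helpers, not kb or AnnData API.

```python
from pathlib import Path


def barcode_map(outdir):
    """Build a barcode -> sample name dict from the kb output directory.

    Assumption: matrix.sample.barcodes and matrix.cells are aligned line by line.
    """
    outdir = Path(outdir)
    barcodes = (outdir / "matrix.sample.barcodes").read_text().split()
    names = (outdir / "matrix.cells").read_text().split()
    if len(barcodes) != len(names):
        raise ValueError("barcode and sample files differ in length")
    return dict(zip(barcodes, names))


def relabel(obs_names, mapping):
    """Translate barcodes to sample names, keeping the input order.

    Barcodes without an entry in the mapping are returned unchanged.
    """
    return [mapping.get(bc, bc) for bc in obs_names]


# Hypothetical usage with an AnnData object `adata` loaded elsewhere:
#   mapping = barcode_map("path/to/kb/output")
#   adata.obs["barcode"] = list(adata.obs_names)   # keep the original barcodes
#   adata.obs_names = relabel(adata.obs_names, mapping)
```

Looking names up by barcode rather than relying on row order means a cell that was filtered out simply has no row to relabel, and the remaining cells do not shift.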