While we have examples from training for processing alevin files, which are well supported by tximport, we did not have anything similar for processing kallisto|bustools files. Kallisto also seems to filter cells much less aggressively when creating the count matrix, so we have to do more of that filtering ourselves.
This notebook provides an example workflow for reading in the relevant result files after counting with bustools. I show three different methods for identifying cells. The current standard, a knee plot, is fast, so it seems likely to be the preferred method going forward.
A more "principled" method is implemented in DropletUtils, which I have also tested here. It is MUCH slower, but may be worth considering if we think that recovering cells with low total expression is important.