BUStools / bustools

Tools for working with BUS files
https://bustools.github.io/
BSD 2-Clause "Simplified" License

Barcode list for DropSeq #10

Closed davemcg closed 5 years ago

davemcg commented 5 years ago

Very excited about the kallisto/bustools release! Just in time for my re-analysis project.

Would you suggest creating a whitelist by running UMI-tools on the Drop-seq FASTQ files, or just skipping the bustools correct step?

https://github.com/CGATOxford/UMI-tools/blob/master/doc/Single_cell_tutorial.md#Variations

lakigigar commented 5 years ago

There is a bustools whitelist command that extracts a whitelist from the barcodes in your data; you can then use that whitelist with bustools correct. The whitelist command is currently implemented in a fork (https://github.com/laureneliu/bustools), so you would have to compile from source. It is under review in a pull request and will be official with the next release of bustools. It has been extensively tested, although not with Drop-seq data, so if you try it, feedback would be appreciated.

davemcg commented 5 years ago

Wow this is slick (and for others reading, https://github.com/laureneliu/bustools/commit/a199123faa25138b7ed7f5f75d3ce9145e54d7c0 can compile on OS X). I assume that you've hard-coded the barcode/UMI patterning into kallisto bus?

This is a really easy workflow to get matrices of counts by cell/gene that extends across multiple technologies with just one flag in one tool (-x TECHNOLOGY)!

kallisto bus r1.fq r2.fq -x TECHNOLOGY -i INDEX -o sample
bustools sort sample/output.bus -o sample/output.sorted.bus
bustools whitelist sample/output.sorted.bus -o WHITELIST.txt
bustools correct sample/output.sorted.bus -w WHITELIST.txt -o sample/output.sorted.correct.bus
bustools count # well just go see https://www.kallistobus.tools/getting_started and https://www.kallistobus.tools/documentation

(For clarity, this ignores the fact that these steps should really all be piped together.)
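For later readers, here is a rough sketch of how the post-kallisto steps might be chained, with the last three streamed through pipes. It assumes a bustools build that includes the whitelist command, and that `-p` (write to stdout) and `-` (read from stdin) behave as in the kallistobus.tools tutorials. The function name `bus_workflow`, the `sample` directory, and `t2g.txt` are placeholders, not anything from this thread.

```shell
# Hypothetical helper: run the bustools steps on a kallisto bus output directory.
# $1 = directory written by `kallisto bus -o`, $2 = transcript-to-gene map (placeholder name).
bus_workflow() {
    outdir="$1"
    t2g="$2"

    # The whitelist is derived from a sorted BUS file, so sort first.
    bustools sort -o "$outdir/output.sorted.bus" "$outdir/output.bus"
    bustools whitelist -o "$outdir/whitelist.txt" "$outdir/output.sorted.bus"

    # Correct barcodes, re-sort, and count without intermediate files:
    # -p sends the BUS records to stdout, "-" reads them from stdin.
    bustools correct -w "$outdir/whitelist.txt" -p "$outdir/output.sorted.bus" \
        | bustools sort -p - \
        | bustools count -o "$outdir/genes" -g "$t2g" \
            -e "$outdir/matrix.ec" -t "$outdir/transcripts.txt" --genecounts -
}

# Usage, after `kallisto bus ... -o sample`:
#   bus_workflow sample t2g.txt
```

The whitelist step cannot be part of the pipe, because correct needs both the finished whitelist and the BUS file; only correct, sort, and count stream into one another here.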

lakigigar commented 5 years ago

Many of the frequently used technologies are hard-coded, but you can also specify a custom barcode/UMI layout (see the kallisto manual for details: https://pachterlab.github.io/kallisto/manual).

davemcg commented 5 years ago

What?! (checks HPC kallisto version). Oh, I've been using 0.45.

Not that I have to do this (thank goodness), but would you pass the custom bc:umi:seq info with the -x flag?

https://pachterlab.github.io/kallisto/manual#bus

lakigigar commented 5 years ago

Yes.
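For anyone who does need the custom route, here is a minimal sketch of building such a string. It assumes, going by the kallisto manual, that each of the three bc:umi:seq fields is a `file,start,stop` triplet with 0-based positions and a stop of 0 meaning "to the end of the read". The helper name `custom_tech` is made up for illustration, and it only covers the simple layout where barcode and UMI sit back to back at the start of read 1 and the cDNA is all of read 2.

```shell
# Hypothetical helper: print a kallisto bus -x string for a layout with
# the barcode then the UMI at the start of file 0, and cDNA in file 1.
# $1 = barcode length, $2 = UMI length
custom_tech() {
    bc_end="$1"
    umi_end=$(( $1 + $2 ))
    # bc = file 0, [0, bc_end); umi = file 0, [bc_end, umi_end); seq = all of file 1
    printf '0,0,%d:0,%d,%d:1,0,0\n' "$bc_end" "$bc_end" "$umi_end"
}

# Drop-seq style layout (12 bp barcode, 8 bp UMI):
#   kallisto bus -x "$(custom_tech 12 8)" -i INDEX -o sample r1.fq r2.fq
```

With a 12 bp barcode and an 8 bp UMI this prints `0,0,12:0,12,20:1,0,0`, which should be equivalent to the built-in Drop-seq setting.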