sdparekh / zUMIs

zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
GNU General Public License v3.0

Demultiplex option in YAML config Rshiny application #346

Closed by Ni-Ar 1 year ago

Ni-Ar commented 1 year ago

Hi, I'm using zUMIs for the first time. After a successful test run, I found out that I actually need a separate bam file for each single cell.

From what I've read in the wiki, zUMIs can turn one big bam file into smaller per-cell demultiplexed bam files (based on the cell barcode ID, I assume). I believe this is achieved by adding demultiplex: yes in the barcodes: section of the yaml config, right?

Is it possible to have a tick box in the Rshiny app to generate yaml files with the demultiplex option?

As for the alternative, I'm not sure how to split the one big bam file I have. I'm fairly new to this type of analysis and software. I've found some Python scripts that could do that, but I'm not sure they're compatible with zUMIs output.

Thanks, Nicco

cziegenhain commented 1 year ago

Hi there,

Sorry for the slow reply! Exactly, the demultiplex: yes option will make a per-cell .bam during the "Counting" stage. Sorry that the option isn't in the Shiny app; to be fully honest, it hasn't been updated in a while.

To demultiplex from an existing bam file, you could use the code from within zUMIs:

python3 zUMIs/misc/demultiplex_BC.py

You need to set the following options:

--bam - Path to the bam file
--out - Path where demultiplexed files go
--bc - Path to the .kept_barcodes.txt in the zUMIs_output folder
--pin - Number of reading threads
--pout - Number of writing threads
--chr 'allreads' (demultiplex all chromosomes/reads)
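For illustration, the call with all of these options could be assembled like this. It is only a sketch: the helper name build_demux_command, the file paths, and the thread counts are placeholders made up for the example and need to be replaced with your own; only the script path and the flag names come from the description above.

```python
import shlex


def build_demux_command(bam, out_dir, barcodes, read_threads, write_threads,
                        script="zUMIs/misc/demultiplex_BC.py"):
    """Assemble the argument list for the demultiplexing script.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        "python3", script,
        "--bam", bam,                   # path to the big bam file
        "--out", out_dir,               # where the per-cell files go
        "--bc", barcodes,               # the .kept_barcodes.txt file
        "--pin", str(read_threads),     # number of reading threads
        "--pout", str(write_threads),   # number of writing threads
        "--chr", "allreads",            # demultiplex all chromosomes/reads
    ]


# All paths and thread counts below are placeholders, not real zUMIs output names.
cmd = build_demux_command(
    bam="path/to/your.bam",
    out_dir="path/to/demultiplexed/",
    barcodes="zUMIs_output/your_project.kept_barcodes.txt",
    read_threads=4,
    write_threads=4,
)

# Print a copy-pasteable shell command.
print(shlex.join(cmd))

# To actually run it from Python instead:
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list (rather than one string) avoids quoting problems if a path contains spaces.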

Good luck. Christoph

Ni-Ar commented 1 year ago

Ah I see, nice, thanks for explaining.