kids-first / kf-alignment-workflow

:microscope: Alignment workflow for Kids-First DRC
Apache License 2.0

BAM Preprocessing optimization suggestion #49

Closed: bogdang989 closed this issue 6 years ago

bogdang989 commented 6 years ago

Tools in the workflow

Picard RevertSam

Current function

Split the input BAM/SAM/CRAM into one BAM file per read group (RG).

Proposed modification

Replace Picard RevertSam with Samtools split for this step.

Performance improvement

Picard RevertSam time: 4h 7m
Samtools Split time: 28m
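
To put the two runtimes above on the same scale, a small sketch that converts both to minutes and computes the ratio. The variable names are illustrative and the figures are only the single test run reported here, not a benchmark.

```shell
#!/usr/bin/env bash
# Convert both reported runtimes to minutes.
revertsam_min=$(( 4 * 60 + 7 ))   # Picard RevertSam: 4h 7m
split_min=28                      # Samtools split: 28m

# Shell arithmetic is integer-only, so use awk for the ratio.
speedup=$(awk -v a="$revertsam_min" -v b="$split_min" 'BEGIN { printf "%.1f", a / b }')

echo "RevertSam: ${revertsam_min} min, split: ${split_min} min, speedup: ${speedup}x"
# prints: RevertSam: 247 min, split: 28 min, speedup: 8.8x
```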

Example command line

/opt/samtools-1.7/samtools split -f '%!.bam' -@ 29 ae3b4fcd963d404081393b9cf038d4d5.bam
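
In the command above, -f sets the output filename format, where %! expands to the read group ID, and -@ sets the number of additional threads. A minimal sketch of a wrapper that assembles the same command for an arbitrary input follows. The function name build_split_cmd, the file name input.bam, and the default of 4 threads are assumptions for illustration and are not part of the workflow. It only prints the command, so it can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the samtools split command for a given input
# and thread count so it can be reviewed before running.
build_split_cmd() {
    local input_bam="$1"
    local threads="${2:-4}"      # assumed default; the issue used 29
    local name_format='%!.bam'   # %! expands to the read group ID
    printf "samtools split -f '%s' -@ %s %s\n" "$name_format" "$threads" "$input_bam"
}

build_split_cmd input.bam 29
# prints: samtools split -f '%!.bam' -@ 29 input.bam
```

With this format string each read group is written to a file named after its ID in the current directory, so it may be safest to run the split inside a dedicated output directory.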

Samtools split was tested on a randomly selected BAM from the pilot set. QualityYield metrics show no difference between the outputs of the two tools, which is expected because splitting by read group is a simple operation.
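
One way such a comparison could be scripted is sketched below. It is not necessarily how the check above was done. It assumes the metrics for each per-RG BAM were written to text files in the Picard metrics format, where lines starting with # carry the command line and timestamps and so differ between runs even when the data rows match. The function names metrics_body and same_metrics and the file paths in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Print only the data rows of a Picard-style metrics file:
# drop '#' header lines (command line, timestamps) and blank lines.
metrics_body() {
    grep -v -e '^#' -e '^$' "$1"
}

# Return 0 if two metrics files carry identical data rows, non-zero otherwise.
same_metrics() {
    [ "$(metrics_body "$1")" = "$(metrics_body "$2")" ]
}
```

Usage, with hypothetical paths: same_metrics revertsam/RG1.qy.txt split/RG1.qy.txt && echo "identical" || echo "differ"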