cancerit / PCAP-core

NGS reference implementations and helper code for mapping (originally part of ICGC-TCGA-PanCancer)
GNU General Public License v2.0

bam_stats - passthrough (CRAM) #12

Closed keiranmraine closed 4 years ago

keiranmraine commented 6 years ago

If CRAM is set as the output format, we currently have to run bam_stats as a separate process after the CRAM file is written to disk (scramble is used because it makes CRAM compression easy to tune). For BAM files, bam_stats can read from tee'ed output.

That isn't possible for CRAM because of the combination of commands used. However, if bam_stats could pass its input data straight to stdout when an output file for the BAS data is provided and a passthrough flag is enabled, we could insert it into the pipeline and save a disk read.
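To make the idea concrete, here is a minimal sketch of the passthrough pattern in Python. It is not how bam_stats is implemented: the function name `passthrough_stats` is hypothetical, it works on SAM text rather than BAM/CRAM, and the counts it gathers are only a stand-in for the real BAS metrics. The point is just that the stream is forwarded unchanged while the stats are collected on the side.

```python
import io


def passthrough_stats(src, dst):
    """Copy every line of `src` to `dst` unchanged, counting SAM lines on the way.

    `src` and `dst` are binary file-like objects (e.g. sys.stdin.buffer and
    sys.stdout.buffer when used inside a shell pipeline).
    """
    counts = {"header_lines": 0, "records": 0, "bytes": 0}
    for line in src:
        dst.write(line)  # forward the data untouched
        counts["bytes"] += len(line)
        if line.startswith(b"@"):
            counts["header_lines"] += 1  # SAM header lines start with '@'
        else:
            counts["records"] += 1  # everything else is an alignment record
    dst.flush()
    return counts


# Small in-memory demonstration (made-up SAM content, for illustration only).
sam = (
    b"@HD\tVN:1.6\tSO:unsorted\n"
    b"read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    b"read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII\n"
)
out = io.BytesIO()
stats = passthrough_stats(io.BytesIO(sam), out)
```

In a real pipeline the same function would be called with `sys.stdin.buffer` and `sys.stdout.buffer`, and the returned counts written to the stats file, so the downstream compressor sees exactly the bytes that came in.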

With CRAM, reading and processing the file from disk is pretty heavyweight compared to having bam_stats read from an uncompressed stream.

Possibly worth adding a thread pool for decompression as part of this.

keiranmraine commented 6 years ago

... looks like it's possible to work around this with the example provided by rob:

https://github.com/samtools/samtools/issues/774#issuecomment-365550905

keiranmraine commented 4 years ago

This has been handled for bwa_mem.pl (which is what this was referencing).