CRAM IO seems to be vastly faster when the reference genome is explicitly passed. This appears to be because an explicitly passed reference genome is trusted, whereas a reference genome linked from within the file is validated to some extent.
In MCHap, this means the reference path should be passed to each process that reads BAM/CRAM files. It also means that an optional reference argument should be added to tools that don't currently require a reference genome.
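A minimal sketch of what this could look like, assuming the files are opened with pysam, whose `AlignmentFile` accepts a `reference_filename` argument. The helper names (`alignment_file_kwargs`, `open_alignment`, `build_parser`) and the `--bam`/`--reference` flags are hypothetical and are not taken from the MCHap code base.

```python
import argparse


def alignment_file_kwargs(reference_path=None):
    """Build keyword arguments for pysam.AlignmentFile (hypothetical helper).

    The reference is only included when one was supplied, so files without
    an explicit reference are opened exactly as before.
    """
    kwargs = {}
    if reference_path is not None:
        kwargs["reference_filename"] = reference_path
    return kwargs


def open_alignment(path, reference_path=None):
    """Open a BAM/CRAM file, passing the reference explicitly if given."""
    # Imported lazily so the helpers above can be used without pysam installed.
    import pysam

    return pysam.AlignmentFile(path, **alignment_file_kwargs(reference_path))


def build_parser():
    """Parser for a tool that does not otherwise need a reference genome."""
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument("--bam", nargs="+", required=True)
    # Optional: only used to speed up CRAM reading when provided.
    parser.add_argument("--reference", default=None)
    return parser
```

Each worker process would then call `open_alignment(path, args.reference)` rather than relying on the reference linked from within the CRAM file.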
Relevant issues: