PlantandFoodResearch / MCHap

Polyploid micro-haplotype assembly using Markov chain Monte Carlo simulation.
MIT License

Improve CRAM performance by passing reference fasta #167

Closed by timothymillar 1 year ago

timothymillar commented 1 year ago

CRAM IO seems to be vastly faster when the reference genome is explicitly passed. This appears to be because an explicitly passed reference genome is trusted, whereas a reference genome linked from within the file is validated to some extent.

Relevant issues:

In MCHap, this means the reference path should be passed to each process that reads BAM/CRAM files. It also means that an optional reference argument should be added to the tools that don't currently require a reference genome.
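
A rough sketch of the idea, not the actual MCHap change: it assumes the files are read with pysam, whose `AlignmentFile` accepts a `reference_filename` argument. The `--bam` and `--reference` option names and the helpers `build_parser`, `alignment_file_kwargs`, `count_reads` and `count_all` are made up for illustration.

```python
import argparse
from functools import partial
from multiprocessing import Pool


def build_parser():
    # Hypothetical CLI: the reference is optional so existing calls keep working.
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument("--bam", nargs="+", required=True,
                        help="One or more BAM/CRAM files.")
    parser.add_argument("--reference", default=None,
                        help="Optional reference FASTA, passed to the CRAM reader.")
    return parser


def alignment_file_kwargs(reference=None):
    # Only pass reference_filename when a reference was actually given,
    # otherwise fall back to whatever the file header points to.
    kwargs = {}
    if reference is not None:
        kwargs["reference_filename"] = reference
    return kwargs


def count_reads(path, reference=None):
    # pysam is imported here so the rest of the sketch works without it installed.
    import pysam

    with pysam.AlignmentFile(path, **alignment_file_kwargs(reference)) as alignments:
        # until_eof=True reads the whole file and does not need an index.
        return sum(1 for _ in alignments.fetch(until_eof=True))


def count_all(paths, reference=None, processes=2):
    # The reference path is a plain string, so it can be sent to every
    # worker process that opens a BAM/CRAM file.
    with Pool(processes) as pool:
        return pool.map(partial(count_reads, reference=reference), paths)
```

With this layout a caller would parse the arguments once and forward `args.reference` to `count_all`, so every worker opens its file with the same explicitly passed reference.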

timothymillar commented 1 year ago

Fixed in #169