ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Divide minigraph construction into chunks #1438

Closed glennhickey closed 4 months ago

glennhickey commented 4 months ago

Instead of running a single minigraph -xggs construction call on all input sequences at once, this PR runs a sequence of such calls on smaller batches, each in a separate Toil job.
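The batching idea can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names `chunk` and `plan_minigraph_calls`, the output naming scheme, and the assumption that each batch is added on top of the GFA produced by the previous batch are all hypothetical. The sketch only builds argument lists (it never runs minigraph), and the second tuple element is the file that each call's stdout would be redirected to.

```python
from typing import List, Tuple


def chunk(items: List[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def plan_minigraph_calls(
    fasta_paths: List[str],
    batch_size: int = 50,
    out_prefix: str = "batch",  # hypothetical naming scheme
) -> List[Tuple[List[str], str]]:
    """Return one (argv, output_gfa) pair per batch.

    Assumption: every call after the first takes the previous batch's
    GFA as its first input, so the graph grows incrementally.
    """
    calls = []
    prev_gfa = None
    for idx, batch in enumerate(chunk(fasta_paths, batch_size)):
        out_gfa = f"{out_prefix}.{idx}.gfa"
        inputs = ([prev_gfa] if prev_gfa else []) + batch
        calls.append((["minigraph", "-xggs"] + inputs, out_gfa))
        prev_gfa = out_gfa
    return calls


if __name__ == "__main__":
    genomes = [f"genome_{i}.fa" for i in range(120)]
    for argv, out_gfa in plan_minigraph_calls(genomes, batch_size=50):
        print(len(argv) - 2, "inputs ->", out_gfa)
```

In a Toil workflow, each planned call would correspond to one job, with the next job depending on the previous job's output.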

The reasons for doing this are:

Both these issues become more likely now that the HAL limitation on input genomes is fixed.

The batch size is controlled by the minigraphConstructBatchSize attribute of the graphmap element in the config xml, i.e. <graphmap minigraphConstructBatchSize="50"> (the default is 50, as shown).
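As a rough illustration of how such a setting could be read, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `<config>` root element in the sample and the helper name `get_batch_size` are placeholders for this example, not taken from the cactus code base; only the `graphmap` element and its `minigraphConstructBatchSize` attribute come from the description above.

```python
import xml.etree.ElementTree as ET

# Sample config; the <config> root element is a placeholder for illustration.
CONFIG_XML = """<config>
  <graphmap minigraphConstructBatchSize="50"/>
</config>"""


def get_batch_size(xml_text: str, default: int = 50) -> int:
    """Return minigraphConstructBatchSize from the first <graphmap> element.

    Falls back to `default` if the element or the attribute is missing.
    """
    root = ET.fromstring(xml_text)
    node = next(root.iter("graphmap"), None)
    if node is None:
        return default
    return int(node.get("minigraphConstructBatchSize", default))


if __name__ == "__main__":
    print(get_batch_size(CONFIG_XML))
```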