ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Fix awk arithmetic issue due to large contig sizes #1127

Closed by glennhickey 1 year ago

glennhickey commented 1 year ago

cactus-graphmap-split uses awk at some point to make a contig size table.

The problem is that some versions of awk, including the one in the Cactus Docker image (mawk), return scientific notation for numbers larger than about 2 Gb (the 32-bit integer limit), and this trips up Cactus, which casts the value to int.

This PR is a little patch that specifies float output, which seems consistent across awk flavours and number sizes and returns output readable by the existing code.
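To make the failure mode concrete, here is a minimal Python sketch of what happens on the parsing side. It is not the Cactus code: the function names, the sample contig size, and the `%.0f` format are all assumptions made for illustration, and the exact format string used by the patch is not shown here. The sketch also assumes that the scientific notation comes from awk's default `OFMT` of `%.6g`, which Python's C-style `%` formatting can reproduce.

```python
# Hypothetical illustration; none of these names come from the Cactus code base.

AWK_DEFAULT_OFMT = "%.6g"  # awk's default OFMT (assumed source of the sci. notation)
EXPLICIT_FLOAT_FMT = "%.0f"  # assumed stand-in for the "float output" in the patch


def render(value, fmt):
    """Format a number with a C-style printf format, as awk would."""
    return fmt % value


def parse_contig_size(text):
    """Parse a size field with a plain int() cast."""
    return int(text)


contig_size = 3_123_456_789  # larger than 2**31 - 1

sci = render(contig_size, AWK_DEFAULT_OFMT)      # scientific notation
plain = render(contig_size, EXPLICIT_FLOAT_FMT)  # plain digits

try:
    parse_contig_size(sci)
    sci_parsed = True
except ValueError:
    # int() rejects strings such as "3.12346e+09"
    sci_parsed = False

print(sci, "parses as int:", sci_parsed)
print(plain, "parses as int:", parse_contig_size(plain) == contig_size)
```

Note that going through `float()` first would avoid the exception but not the underlying problem: six significant digits are not enough to recover the original size, so fixing the output format on the awk side is the safer choice.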

In no way does this mean that cactus-pangenome will now work well on giant genomes... it's just this one particular crash that should be fixed.

Resolves #1125