ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Only use abPOA's progressive mode if input lengths all (about) the same #1509

Closed glennhickey closed 1 month ago

glennhickey commented 1 month ago

A couple of times now, it seems that abPOA's "progressive" input sorter (-p), which is on by default in cactus, has only made things worse.

Instead of giving up on it entirely (I'm sure I've seen it help before), progressive sorting is now only activated if the input sequences are all (about) the same length. The logic is that we prefer sorting by length, but there's not much to lose by trying progressive when there is no length signal.

A threshold (defaulting to 5% in the config) is used to determine if the lengths are the same.
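
As a rough illustration of the check described above, here is a minimal Python sketch. It is not the code from this PR: the function names are hypothetical, and it assumes the threshold is applied as the relative difference between the shortest and longest input, which the description does not spell out.

```python
def lengths_about_equal(lengths, threshold=0.05):
    """Return True if all lengths fall within `threshold` of the longest one.

    Assumption: "same length" means (longest - shortest) / longest <= threshold.
    The actual comparison used in cactus may differ.
    """
    if not lengths:
        return True
    longest = max(lengths)
    shortest = min(lengths)
    if longest == 0:
        # All sequences are empty, so there is no length signal at all.
        return True
    return (longest - shortest) / longest <= threshold


def choose_sort_mode(lengths, threshold=0.05):
    """Pick a (hypothetical) sort mode label for the abPOA inputs.

    "progressive" stands for enabling abPOA's -p sorter; "length" stands for
    sorting the inputs by length instead.
    """
    if lengths_about_equal(lengths, threshold):
        return "progressive"
    return "length"


if __name__ == "__main__":
    print(choose_sort_mode([1000, 980, 1010]))  # lengths differ by about 3%
    print(choose_sort_mode([1000, 500]))        # lengths differ by 50%
```

With the 5% default, the first call falls back to progressive sorting because the lengths carry no useful ordering signal, while the second keeps the length-based order.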