This PR contains a couple more patches to allow pangenome alignments with tons of samples to go through.
Spanning tree calculation is used to fit outgroups into a subtree, and seems to be quadratic (like outgroup selection itself) in the number of species. This PR disables this computation when it's not needed (which is the case for pangenomes).
abPOA's "progressive" mode is on by default. It sorts the input sequences using a distance matrix, which is also quadratic in the number of input genomes, and therefore doesn't scale past about 10k. This PR adds a limit to the config, `partialOrderAlignmentProgressiveMaxRows`, which defaults to `5000`, and makes sure progressive mode is disabled in abPOA for any input with more than this many rows. I'm using `5000` even though it can go a bit higher (at least double) because, even when it doesn't crash, it gets really slow in the thousands.
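The row-limit check can be sketched roughly as below. This is an illustration only, not the code in this PR: the function name, its arguments, and the constant name are hypothetical, and only the config option `partialOrderAlignmentProgressiveMaxRows` and its default of 5000 come from the description above.

```python
# Hypothetical sketch of the gating logic; not the actual implementation.
# Only the default value (5000) is taken from the PR description.
DEFAULT_PROGRESSIVE_MAX_ROWS = 5000  # partialOrderAlignmentProgressiveMaxRows


def use_progressive(num_rows, progressive_enabled=True,
                    max_rows=DEFAULT_PROGRESSIVE_MAX_ROWS):
    """Return True if progressive mode should stay on for this input.

    Progressive mode is switched off for any input with more than
    max_rows rows, since its distance matrix is quadratic in row count.
    """
    return progressive_enabled and num_rows <= max_rows
```

Under these assumptions, an input with exactly 5000 rows keeps progressive mode, while one with 5001 rows falls back to the non-progressive path.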