ComparativeGenomicsToolkit / taffy

This is a C/Python/CLI library for working with TAF (.taf, .taf.gz) and MAF (.maf) alignment files
MIT License

Use abPOA for gap alignments #41

Closed (glennhickey closed 6 months ago)

glennhickey commented 7 months ago

This replaces the WFA-based gap sequence aligner with abPOA. The default cactus scoring parameters are hardcoded. It passes all tests, but I haven't yet tested it at scale. In particular, it will only work up to about 100kb. That should be fine in practice, but at the very least it needs an explicit check.
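
To illustrate the kind of explicit check mentioned above, here is a minimal sketch in Python. It is not taken from the PR: the names `MAX_GAP_LENGTH` and `can_align_gap` are hypothetical, and the 100,000 base threshold is only an assumption based on the "about 100kb" estimate. What the caller should do when the check fails (skip the gap, fall back to another aligner, raise an error) is not specified in this thread.

```python
# Hypothetical limit, assumed from the "about 100kb" estimate in the comment above.
MAX_GAP_LENGTH = 100_000


def can_align_gap(seqs, max_length=MAX_GAP_LENGTH):
    """Return True if every gap sequence is short enough to hand to the aligner.

    seqs is any iterable of strings; an empty input trivially passes.
    """
    return all(len(s) <= max_length for s in seqs)


if __name__ == "__main__":
    gaps = ["ACGT", "ACGTTT", "A"]
    if can_align_gap(gaps):
        print("gap sequences are within the limit")
    else:
        print("gap too long; the caller must decide how to handle it")
```

Keeping the limit as a named parameter makes it easy to adjust once the aligner has been tested at scale.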

@benedictpaten do you mind testing it builds and runs for you?

As an aside, it wouldn't even pass the tests until I sorted the abPOA input by length (longest first). The problem case is below:

printf ">1\nNNNNNNNNNNNNN\n>2\nNNNNNNNNNNNNNNNNNNN\n>3\nNNNNNNNNNNNNNN\n" > x.fa
taffy/submodules/abPOA/bin/abpoa -r 1 -m 0 x.fa
[main] CMD:  taffy/submodules/abPOA/bin/abpoa -r 1 -m 0 x.fa
>1
NNNNNNNNNNNNN-------
>2
NNNNNNNNNNNNN-NNNNNN
>3
NNNNNNNNNNNNNN------
[abpoa_main] Real time: 0.001 sec; CPU: 0.005 sec; Peak RSS: 0.004 GB.

But sorting the input by length works:

printf ">2\nNNNNNNNNNNNNNNNNNNN\n>3\nNNNNNNNNNNNNNN\n>1\nNNNNNNNNNNNNN\n" > x.sort.fa
taffy/submodules/abPOA/bin/abpoa -r 1 -m 0 x.sort.fa
[main] CMD:  taffy/submodules/abPOA/bin/abpoa -r 1 -m 0 x.sort.fa
>2
NNNNNNNNNNNNNNNNNNN
>3
NNNNNNNNNNNNNN-----
>1
NNNNNNNNNNNNN------
[abpoa_main] Real time: 0.001 sec; CPU: 0.005 sec; Peak RSS: 0.004 GB.
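
The workaround above, sorting the input longest first, can be sketched as follows. This is an illustration in Python rather than the code from the PR, and the helper names `sort_by_length_desc` and `restore_order` are hypothetical. It also assumes the caller wants the aligned rows back in the original input order, which the thread does not state.

```python
def sort_by_length_desc(seqs):
    """Sort sequences longest first.

    Returns (sorted_seqs, order), where order[i] is the original index of
    sorted_seqs[i]. The sort is stable, so equal-length sequences keep
    their relative input order.
    """
    order = sorted(range(len(seqs)), key=lambda i: len(seqs[i]), reverse=True)
    return [seqs[i] for i in order], order


def restore_order(rows, order):
    """Put rows produced in sorted order back into the original input order."""
    restored = [None] * len(rows)
    for sorted_pos, original_idx in enumerate(order):
        restored[original_idx] = rows[sorted_pos]
    return restored


if __name__ == "__main__":
    # Same lengths as the example above: 13, 19 and 14 Ns.
    seqs = ["N" * 13, "N" * 19, "N" * 14]
    sorted_seqs, order = sort_by_length_desc(seqs)
    print([len(s) for s in sorted_seqs])
    # Stand-in for the aligner output: one row per sorted input sequence.
    rows = sorted_seqs
    print([len(r) for r in restore_order(rows, order)])
```

Tracking the permutation alongside the sorted list means the alignment rows can be matched back to their source sequences without relying on sequence names.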

benedictpaten commented 6 months ago

I fixed the Makefile, and it builds fine on my machine. But the Python build is broken by the inclusion of abPOA. I'm not quite sure of the right way to fix it.

benedictpaten commented 6 months ago

Can you try building the Python part on your machine?

https://github.com/ComparativeGenomicsToolkit/taffy?tab=readme-ov-file#installing-python-library-from-source

benedictpaten commented 6 months ago

Actually, maybe if I move merge_adjacent_alignments out into a separate header file, we can keep it separated from the code in the Python library...

glennhickey commented 6 months ago

Oof, yeah, I hadn't quite gotten to the Python build. I'm getting "fatal error: abpoa.h: No such file", which I imagine is your problem too? Taking a look now...

benedictpaten commented 6 months ago

Okay, this all works on my Mac (i.e. it builds and the tests pass).