implement a progressive alignment integration algorithm

xingjianleng / DBGA

The repository for the genome sequence alignment research project

BSD 3-Clause "New" or "Revised" License


The limitation of dbga is that it needs a k large enough that (nearly) all k-mers within a sequence are unique, which works against the competing requirement that the number of k-mers shared between sequences is maximised.
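To make the trade-off concrete, here is a minimal sketch that measures both quantities for a given k. The function names `fraction_unique` and `num_shared` are hypothetical helpers written for illustration; they are not part of dbga or cogent3.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every k-mer (overlapping window of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def fraction_unique(seq, k):
    """Fraction of distinct k-mers that occur exactly once in seq."""
    counts = kmer_counts(seq, k)
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


def num_shared(seqs, k):
    """Number of distinct k-mers present in every sequence."""
    shared = set(kmer_counts(seqs[0], k))
    for seq in seqs[1:]:
        shared &= set(kmer_counts(seq, k))
    return len(shared)


if __name__ == "__main__":
    seqs = ["ACGT", "ACGA"]
    for k in (2, 3, 4):
        uniq = min(fraction_unique(s, k) for s in seqs)
        print(k, uniq, num_shared(seqs, k))
```

Scanning k like this shows the tension: raising k pushes the unique fraction towards 1 while the shared count falls, and adding more sequences to `seqs` can only shrink the shared set further.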

The more sequences are aligned at once, the harder this trade-off becomes, which will (ultimately) increase the number of bubbles. My proposed "hack" is to:

1. get a crude guide tree
2. select triples by traversing the guide tree, such that each triple has a sequence in common with its neighbouring triples
3. align each triple of sequences
4. merge triples that share a sequence, using the align-to-reference capabilities of cogent3
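The triple-selection step could be sketched roughly as follows. This is only one possible reading of the proposal: it assumes the guide tree is given as nested tuples of sequence names (a stand-in for a real tree object) and uses a sliding window over the depth-first leaf order, which is a simplification rather than the method the issue prescribes. `leaf_order` and `overlapping_triples` are hypothetical names; the alignment and merging steps are left to dbga and cogent3 and are not shown.

```python
def leaf_order(tree):
    """Return leaf names in depth-first order.

    Assumption: the guide tree is a nested tuple whose leaves are strings.
    """
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaf_order(child))
    return names


def overlapping_triples(names):
    """Slide a window of 3 with step 2 so neighbouring triples share a name."""
    if len(names) < 3:
        raise ValueError("need at least three sequences")
    triples = []
    start = 0
    while start + 3 <= len(names):
        triples.append(tuple(names[start:start + 3]))
        start += 2
    # If the last name was not covered, add a final triple anchored at the end.
    if triples[-1][-1] != names[-1]:
        triples.append(tuple(names[-3:]))
    return triples


if __name__ == "__main__":
    guide_tree = ((("a", "b"), "c"), ("d", "e"))
    print(overlapping_triples(leaf_order(guide_tree)))
```

Because leaves that are adjacent in the traversal tend to be close in the guide tree, each triple groups relatively similar sequences, and the shared sequence gives the merge step a common reference between neighbouring triples.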

implement a progressive alignment integration algorithm #23