mikolmogorov / maf2synteny

A tool for recovering synteny blocks from multiple alignment

Synteny blocks are very short #9

Open aminakur opened 1 year ago

aminakur commented 1 year ago

I used maf2synteny to extract synteny blocks from a Cactus alignment of five genomes, with a maximum divergence time between genomes of 3 mya. I therefore expected to see extended blocks of synteny (hundreds of kb to Mb). With different -b parameters (I tried values from 5000 to 100000) I obtained blocks from 5 to 177 kb. How could I increase the lengths of the blocks? What simplification parameters could I try?

mikolmogorov commented 1 year ago

The current Cactus + Ragout setup should certainly work for genomes with 3 mya divergence (expected sequence-level divergence <1%).

maf2synteny produces somewhat "fine" synteny blocks, which works well for reference-assisted assembly but may not be ideal for other tasks that need larger blocks. The important parameter for us was the coverage of the blocks, rather than their length.
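To compare parameter settings on both criteria mentioned above (coverage and block length), something like the following Python sketch may help. It is not part of maf2synteny: it assumes you have already parsed the blocks for one genome into half-open `(start, end)` intervals yourself, and the function names `merged_length`, `coverage`, and `n50` are hypothetical helpers introduced here for illustration.

```python
from typing import Iterable, List, Tuple

# Assumption: each block on one sequence is a half-open interval [start, end).
Interval = Tuple[int, int]


def merged_length(intervals: Iterable[Interval]) -> int:
    """Number of bases covered by the union of the intervals (overlaps counted once)."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous run, start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def coverage(intervals: Iterable[Interval], genome_length: int) -> float:
    """Fraction of the genome covered by at least one block."""
    return merged_length(intervals) / genome_length


def n50(lengths: List[int]) -> int:
    """Length L such that blocks of length >= L hold at least half of the total block length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # no blocks given


# Toy data, made up purely for illustration.
blocks = [(0, 5000), (4000, 12000), (20000, 30000)]
block_lengths = [end - start for start, end in blocks]
block_coverage = coverage(blocks, genome_length=40000)
block_n50 = n50(block_lengths)
```

Running this once per parameter set and comparing `block_coverage` against `block_n50` would show whether longer blocks are being gained at the cost of coverage.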

Overall, multi-way synteny block construction is admittedly a tricky problem, and maf2synteny may not be the ultimate solution. I know that some other groups had better luck getting longer blocks by tuning the simplification parameters. The algorithm is fairly straightforward, and the Genome Research paper should give you a good intuition for how it works. It may make sense to try adding iterations with more aggressive simplification steps, or removing some intermediate steps. There is no easy recipe, unfortunately.