tanghaibao / jcvi

Python library to facilitate genome assembly, annotation, and comparative genomics
BSD 2-Clause "Simplified" License

Use closely related genomes to further scaffolding genome #396

Closed: yplee614 closed this issue 1 month ago

yplee614 commented 3 years ago

Hi, I used ALLMAPS to scaffold the genome based on the markers, but only 78% of the contigs were anchored to chromosomes. I would like to use closely related genomes to scaffold the remaining contigs that were not placed on chromosomes. Several tools can do this, such as pyScaf and Ragout, but they take assembled contigs as input.

So, can you give me some suggestions about it? Many thanks.

tanghaibao commented 3 years ago

@yplee614

This is a trade-off. Using genetic markers is safer but I understand that the anchor rate is lower. However, introducing synteny into the scaffolding process adds a heavy assumption that is a bit difficult to justify (unless the genomes are very closely related).

If you insist on integrating these two, my suggestion is to start with a synteny-guided approach, and then introduce ALLMAPS.

The synteny approach might introduce chimeric contigs/scaffolds, so you'll need to correct them: https://github.com/tanghaibao/jcvi/wiki/ALLMAPS%3A-How-to-split-chimeric-contigs
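As a rough illustration of what "chimeric" means here, the sketch below flags contigs whose markers fall on more than one linkage group. It is not the procedure from the wiki page above and not part of ALLMAPS; the function name `find_chimeric_candidates`, the `min_markers` threshold, and the four-column marker layout (`contig start end LG:position`) are assumptions made for this example.

```python
from collections import Counter, defaultdict


def find_chimeric_candidates(marker_lines, min_markers=2):
    """Flag contigs supported by markers from more than one linkage group.

    Assumed input: whitespace-separated lines of the form
        contig  start  end  LG:position
    A linkage group only counts if at least `min_markers` markers on the
    contig point to it, so a single stray marker does not trigger a flag.
    """
    counts = defaultdict(Counter)
    for line in marker_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        contig, _start, _end, name = line.split()[:4]
        linkage_group = name.rsplit(":", 1)[0]
        counts[contig][linkage_group] += 1

    candidates = {}
    for contig, lg_counts in counts.items():
        supported = {lg: n for lg, n in lg_counts.items() if n >= min_markers}
        if len(supported) > 1:
            candidates[contig] = supported
    return candidates


# Hypothetical marker data: ctg1 is split between LG1 and LG2,
# ctg2 has one stray LG3 marker that falls below the threshold.
example = [
    "ctg1 100 101 LG1:1.0",
    "ctg1 200 201 LG1:2.0",
    "ctg1 5000 5001 LG2:3.0",
    "ctg1 6000 6001 LG2:4.0",
    "ctg2 10 11 LG1:5.0",
    "ctg2 20 21 LG1:6.0",
    "ctg2 30 31 LG3:0.5",
]
print(find_chimeric_candidates(example))
```

A contig flagged this way is only a candidate; the actual breakpoint still has to be located and the contig split before rerunning the scaffolding.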

Then run ALLMAPS on the corrected contig set. Hopefully the anchor rate will be higher. And since the genetic map is the last line of evidence and is used to correct the synteny-based scaffolds, the result is easier to defend than a synteny-only approach.
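To compare the anchor rate before and after such a change, it helps to compute it both by contig count and by bases, since the two can differ. The following is a minimal sketch; the function name `anchor_rate` and its inputs (a dict of contig lengths and a set of anchored contig names) are hypothetical and would have to be extracted from your own assembly and scaffolding output.

```python
def anchor_rate(contig_lengths, anchored):
    """Return (fraction of contigs anchored, fraction of bases anchored).

    contig_lengths: dict mapping contig name -> length in bp
    anchored: set of contig names that were placed on chromosomes
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total_bases = sum(contig_lengths.values())
    anchored_names = [name for name in contig_lengths if name in anchored]
    anchored_bases = sum(contig_lengths[name] for name in anchored_names)
    return len(anchored_names) / len(contig_lengths), anchored_bases / total_bases


# Hypothetical example: one long anchored contig, two short unanchored ones.
by_count, by_bases = anchor_rate({"a": 700, "b": 200, "c": 100}, {"a"})
print(f"anchored: {by_count:.1%} of contigs, {by_bases:.1%} of bases")
```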

Haibao