rrwick / Unicycler

hybrid assembly pipeline for bacterial genomes
GNU General Public License v3.0

assembly for a plant genome #101

Closed (bioteksampath closed 6 years ago)

bioteksampath commented 6 years ago

Hi

I wonder how well this works with larger genomes (I'm interested in using it for a plant genome of about 500 Mb). Do I need to modify the command in any way?

Thanks, sampath

rrwick commented 6 years ago

Unfortunately, I fear Unicycler is doomed to fail on a big plant genome for two reasons. First, it was developed with a haploid genome in mind, so diploid genomes (or worse, polyploid plant genomes) could really mess it up. Second, Unicycler isn't very fast and I think some of its algorithms will take forever on a large genome.

If you have both long and short reads, I'd suggest doing a long-read-only assembly (e.g. with Canu) and then polishing it with the short reads. Alternatively, MaSuRCA is an assembler that can do hybrid assemblies on big genomes, but I haven't tried it myself.

Ryan
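
For readers who want to try the workflow suggested above (long-read-only assembly, then short-read polishing), here is a minimal sketch of what the commands might look like. It only builds and prints command lines; it does not run anything. The thread names Canu but no polishing tool, so the use of BWA, samtools and Pilon here is an assumption, as are all file names, the `asm` prefix and the `500m` genome size. Recent Canu releases take `-nanopore` / `-pacbio`, while older releases used `-nanopore-raw` / `-pacbio-raw`, so check the version you have installed.

```python
import shlex


def canu_command(long_reads, genome_size, prefix="asm", out_dir="canu_out",
                 read_type="nanopore"):
    """Return the argument list for a long-read-only Canu assembly.

    read_type is "nanopore" or "pacbio"; older Canu releases expect
    "-nanopore-raw" / "-pacbio-raw" instead of the bare option names.
    """
    if read_type not in ("nanopore", "pacbio"):
        raise ValueError("read_type must be 'nanopore' or 'pacbio'")
    return [
        "canu",
        "-p", prefix,                  # prefix for output file names
        "-d", out_dir,                 # output directory
        f"genomeSize={genome_size}",   # estimated genome size, e.g. "500m"
        f"-{read_type}", long_reads,
    ]


def polishing_commands(assembly, reads_1, reads_2, bam="short_reads.bam",
                       out_prefix="polished"):
    """Return shell command strings for one round of short-read polishing.

    Assumption: BWA + samtools for alignment and Pilon for polishing.
    The thread itself does not say which polisher to use.
    """
    asm, r1, r2, bam_q = (shlex.quote(p) for p in (assembly, reads_1, reads_2, bam))
    return [
        f"bwa index {asm}",
        f"bwa mem {asm} {r1} {r2} | samtools sort -o {bam_q} -",
        f"samtools index {bam_q}",
        f"pilon --genome {asm} --frags {bam_q} --output {shlex.quote(out_prefix)}",
    ]


if __name__ == "__main__":
    # Hypothetical input files; replace with your own read sets.
    print(shlex.join(canu_command("long_reads.fastq.gz", "500m")))
    for cmd in polishing_commands("canu_out/asm.contigs.fasta",
                                  "short_1.fastq.gz", "short_2.fastq.gz"):
        print(cmd)
```

Canu writes its contigs to `<out_dir>/<prefix>.contigs.fasta`, which is why the polishing step above points at `canu_out/asm.contigs.fasta`. Polishing is often repeated for more than one round, feeding each polished assembly back in as the new input.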