sanger-pathogens / iva

de novo virus assembler of Illumina paired reads
http://sanger-pathogens.github.io/iva/

Excessive number of iterations #80

Open buchkovm opened 6 years ago

buchkovm commented 6 years ago

I am using IVA to assemble sequence reads from multiple samples of the same virus. Most samples assemble within hours, but some run for days and go through hundreds of iterations without ever completing (I killed a few jobs after a week of running). Is this expected behavior? If so, is there a way to limit the number of iterations and still output the contigs that were successfully assembled? Any ideas about which aspects of the data might cause this behavior?

martinghunt commented 6 years ago

You could try --max_contigs. It stops IVA from making new contigs, but it does not stop IVA from continuing to iterate in order to extend the contigs it already has.
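
As a rough sketch of how that could look in practice, the Python below builds an IVA command line with --max_contigs and runs it under a wall-clock limit. The time limit is a workaround applied from outside, not an IVA feature, and the file names, the output directory and the value of 10 contigs are placeholders rather than recommendations. Whether a run stopped this way leaves usable contigs on disk is not something this thread confirms.

```python
import subprocess


def build_iva_command(reads_fwd, reads_rev, out_dir, max_contigs=10):
    """Return the argument list for an IVA run capped at max_contigs contigs.

    The read files and output directory are whatever the caller passes in;
    the default of 10 is an arbitrary illustration.
    """
    return [
        "iva",
        "--max_contigs", str(max_contigs),
        "-f", reads_fwd,
        "-r", reads_rev,
        out_dir,
    ]


def run_with_time_limit(cmd, hours):
    """Run cmd and report whether it finished within the time limit.

    Returns True if the command exited successfully, False if it was
    killed after `hours` hours. A non-zero exit raises CalledProcessError.
    """
    try:
        subprocess.run(cmd, check=True, timeout=hours * 3600)
        return True
    except subprocess.TimeoutExpired:
        return False
```

For example, `run_with_time_limit(build_iva_command("sample_1.fastq.gz", "sample_2.fastq.gz", "iva_out"), hours=24)` would give one sample a day to finish. The timeout only kills the `iva` process itself, so any helper programs it launched may need cleaning up separately.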

The most likely cause is contamination. IVA's run time scales with the total length of the genome that it is trying to assemble (and with read depth). A contaminant sequence that is long and has enough depth to assemble will make IVA spend a long time trying to assemble it. You could try BLASTing the contigs it did make against NR to see what it was trying to assemble.
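
One possible way to do that check, sketched in Python under some assumptions: BLAST+ is installed, the contigs are in a FASTA file (the path is a placeholder), and the search uses blastx because nr is a protein database (blastn against nt would be the nucleotide equivalent). The second function reads BLAST's default tabular output (-outfmt 6) and keeps the highest-scoring hit for each contig, so hits to anything other than the expected virus stand out.

```python
def build_blast_command(contigs_fasta, out_tsv, program="blastx", db="nr"):
    """Return a BLAST+ argument list for a remote search with tabular output.

    program/db default to blastx against nr; pass "blastn" and "nt" to
    search nucleotide sequences instead. Paths are supplied by the caller.
    """
    return [
        program,
        "-query", contigs_fasta,
        "-db", db,
        "-remote",                 # search on NCBI servers, no local database
        "-outfmt", "6",            # 12-column tab-separated output
        "-max_target_seqs", "5",
        "-out", out_tsv,
    ]


def best_hit_per_contig(lines):
    """Map each query contig to its (subject id, bitscore) with the top bitscore.

    Expects lines in default -outfmt 6 order, where column 1 is the query id,
    column 2 the subject id and column 12 the bitscore.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best
```

After running the command from `build_blast_command`, something like `best_hit_per_contig(open("hits.tsv"))` gives one line of evidence per contig. Remote searches against nr can be slow for many contigs, so a local database may be more practical.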