Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads
GNU General Public License v3.0

Reproducibility of results #152

Closed dlavrov closed 1 year ago

dlavrov commented 1 year ago

Question or Expected behavior: I ran NextDenovo three times using the same control and input files, and it produced three different assemblies that varied in the number of scaffolds (159-181), the total size (157-160 Mbp), and the size of the largest scaffold (6-13 Mbp). What is the reason for this variation, and is there any way to make the assembly reproducible (e.g., by entering a "random" number)?

Operating system: Red Hat Enterprise Linux Server release 7.9 (Maipo)

NextDenovo version: 2.5.0

moold commented 1 year ago

Hi, it is hard to make the assembly reproducible, for two reasons:

  1. Some multi-threaded modules produce their output in a different order on each run.
  2. NextDenovo makes wide use of unstable sorting algorithms.
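
To illustrate how these two effects can interact, here is a minimal Python sketch. It is a general illustration, not NextDenovo's code: the `reads` records and both helper functions are hypothetical. It assumes that threads hand records to a sort in a different order on each run, and that the sort key has ties. Python's `sorted` is stable, so the order of tied records simply follows the input order, which is enough to show the run-to-run difference. An unstable sort gives even fewer guarantees about tied records.

```python
# Hypothetical records: (read_id, length). Several reads share a length,
# so sorting on length alone leaves ties.
reads = [("r1", 500), ("r2", 500), ("r3", 300), ("r4", 500), ("r5", 300)]

def sort_by_length(records):
    # Non-unique key: the order among equal-length reads depends on
    # the order in which the records arrived.
    return sorted(records, key=lambda r: r[1], reverse=True)

def sort_with_tiebreak(records):
    # Total order: length descending, then read ID ascending.
    # No two records compare equal, so input order no longer matters.
    return sorted(records, key=lambda r: (-r[1], r[0]))

# Simulate two runs in which worker threads emitted the same records
# in a different order.
run_a = list(reads)
run_b = list(reversed(reads))

print(sort_by_length(run_a) == sort_by_length(run_b))          # False
print(sort_with_tiebreak(run_a) == sort_with_tiebreak(run_b))  # True
```

In this toy case, adding a unique tie-breaker to the sort key removes the dependence on arrival order. Whether the same approach is practical inside an assembler depends on its internals, which this thread does not describe.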