alekseyzimin / masurca


Different output for the same input #74

Open eleniadam opened 5 years ago

eleniadam commented 5 years ago

Dear @alekseyzimin,

I am running MaSuRCA with Illumina paired-end (PE) reads and Nanopore long reads as input.

I ran MaSuRCA twice with the same configuration file (in two separate folders), but instead of getting the same output (final.genome.scf.fasta) both times, I got two different results. This happens even when I use only 1 thread.

Why is this happening? Shouldn't the output of running the same configuration file (input parameters) always be the same?

Thank you, Eleni

alekseyzimin commented 5 years ago

Hi,

Because nanopore reads have high error rates, there is some uncertainty in correcting them. Sometimes two paths through the assembly graph constructed from the Illumina data are equally consistent with the nanopore reads, and the code picks one of them at random. As a result, the final assemblies can differ slightly between runs on the same data.

Almost all assemblers that use nanopore data will experience the same behavior.
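
To illustrate the effect described above, here is a minimal toy sketch in Python. It is not MaSuRCA's code: the path names, the scores, and the functions `pick_path` and `simulate` are all hypothetical, made up only to show how random tie-breaking between equally good candidates can make two runs on identical input disagree, even with a single thread. Whether MaSuRCA exposes a seed for this step is not stated in this thread.

```python
import random

# Hypothetical toy data: candidate paths through an assembly graph and how
# well each one agrees with a noisy long read (higher is better).
CANDIDATE_PATHS = {
    "path_A": 0.92,
    "path_B": 0.92,  # exactly as consistent with the read as path_A
    "path_C": 0.85,
}


def pick_path(scores, rng):
    """Return the best-scoring path, breaking ties at random."""
    best = max(scores.values())
    tied = sorted(name for name, score in scores.items() if score == best)
    return rng.choice(tied)


def simulate(seed, n_reads=20):
    """Resolve n_reads ambiguous reads; seed=None uses OS entropy."""
    rng = random.Random(seed)
    return [pick_path(CANDIDATE_PATHS, rng) for _ in range(n_reads)]


if __name__ == "__main__":
    # Unseeded runs will usually disagree somewhere along the way.
    print("unseeded runs identical:", simulate(None) == simulate(None))
    # With a fixed seed the same choices are made every time.
    print("seeded runs identical:", simulate(1) == simulate(1))
```

In this sketch, only a fixed seed makes the choices repeatable. Using one thread does not help, because the variation comes from the random choice itself and not from thread scheduling.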

Best, Aleksey

(In reply to Eleni Adam's message of Wed, Oct 24, 2018, 11:37 AM.)