alekseyzimin / masurca


Bug report #183

Open CIWa opened 4 years ago

CIWa commented 4 years ago

Hi @alekseyzimin,

While running MaSuRCA (version 3.3.9), I encountered an error:

./assemble.sh
[Mon 27 Jul 20:37:59 CEST 2020] Processing pe library reads
[Mon 27 Jul 20:37:59 CEST 2020] Average PE read length 146
[Mon 27 Jul 20:38:00 CEST 2020] Using kmer size of 99 for the graph
[Mon 27 Jul 20:38:00 CEST 2020] MIN_Q_CHAR: 33
[Mon 27 Jul 20:38:00 CEST 2020] Estimated genome size: 5511448095
[Mon 27 Jul 20:38:00 CEST 2020] Computing super reads from PE
[Mon 27 Jul 20:38:00 CEST 2020] Using Flye from MaSuRCA-3.3.9/bin/../Flye/bin
[Mon 27 Jul 20:38:00 CEST 2020] Running mega-reads correction/assembly
[Mon 27 Jul 20:38:00 CEST 2020] Using mer size 15 for mapping, B=17, d=0.029
[Mon 27 Jul 20:38:00 CEST 2020] Estimated Genome Size 5511448095
[Mon 27 Jul 20:38:00 CEST 2020] Estimated Ploidy 2
[Mon 27 Jul 20:38:00 CEST 2020] Using 128 threads
[Mon 27 Jul 20:38:00 CEST 2020] Output prefix mr.41.15.17.0.029
[Mon 27 Jul 20:38:00 CEST 2020] Pacbio coverage >25x, using 25x of the longest reads
[Mon 27 Jul 20:40:45 CEST 2020] Refining alignments
ERROR: failed to merge alignments at position 401
       Please file a bug report
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: failed to merge alignments at position 126
       Please file a bug report
xargs: refine.sh: exited with status 255; aborting
ERROR: failed to merge alignments at position 409
       Please file a bug report
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: Could not parse delta file, /dev/stdin
error no: 402
[Mon 27 Jul 23:15:46 CEST 2020] Computing allowed merges
cat: mr.41.15.17.0.029.all.txt: No such file or directory
[Mon 27 Jul 23:15:46 CEST 2020] computing allowed merges failed
[Mon 27 Jul 23:15:46 CEST 2020] Assembly with flye failed, see files under flye/

There was no shortage of RAM at the point of failure. I checked for a faulty mummer installation, as mentioned in #4: I re-installed MaSuRCA, logged stderr and stdout, and searched for errors and warnings regarding mummer, but could not find any. Furthermore, no other mummer is installed on the server. I then ran an assembly on the test data set from the MaSuRCA blogspot, and that one worked perfectly well with my current installation. Do you have any suggestion as to what the problem might be?
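For reference, the log search described above can be sketched roughly as follows. This is only a minimal sketch, assuming the installer's stdout and stderr were captured into a single text file; the file name `install.log` and the helper `find_mummer_issues` are hypothetical and not part of MaSuRCA.

```python
import re
from pathlib import Path

# A line is reported only if it mentions mummer AND an error/warning keyword.
_KEYWORD = re.compile(r"error|warning", re.IGNORECASE)
_MUMMER = re.compile(r"mummer", re.IGNORECASE)


def find_mummer_issues(lines):
    """Return (line_number, text) pairs for lines that mention both
    'mummer' and 'error' or 'warning' (case-insensitive)."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if _MUMMER.search(line) and _KEYWORD.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical file holding the captured install output.
    log_path = Path("install.log")
    if log_path.exists():
        with log_path.open(errors="replace") as handle:
            for number, text in find_mummer_issues(handle):
                print(f"{number}: {text}")
```

An empty result only means that no line contained both keywords; it does not prove that the mummer build itself is sound.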

Best wishes, Isabel

CIWa commented 4 years ago

Update: I upgraded to version 3.4.1, and the same errors occur:

./assemble.sh
[Tue 28 Jul 18:19:18 CEST 2020] Processing pe library reads
[Tue 28 Jul 18:19:18 CEST 2020] Average PE read length 146
[Tue 28 Jul 18:19:18 CEST 2020] Using kmer size of 99 for the graph
[Tue 28 Jul 18:19:19 CEST 2020] MIN_Q_CHAR: 33
[Tue 28 Jul 18:19:19 CEST 2020] Estimated genome size: 5511448095
[Tue 28 Jul 18:19:19 CEST 2020] Computing super reads from PE
[Tue 28 Jul 18:19:19 CEST 2020] Using Flye from MaSuRCA-3.4.1/bin/../Flye/bin
[Tue 28 Jul 18:19:19 CEST 2020] Running mega-reads correction/assembly
[Tue 28 Jul 18:19:19 CEST 2020] Using mer size 15 for mapping, B=17, d=0.029
[Tue 28 Jul 18:19:19 CEST 2020] Estimated Genome Size 5511448095
[Tue 28 Jul 18:19:19 CEST 2020] Estimated Ploidy 2
[Tue 28 Jul 18:19:19 CEST 2020] Using 128 threads
[Tue 28 Jul 18:19:19 CEST 2020] Output prefix mr.41.15.17.0.029
[Tue 28 Jul 18:19:19 CEST 2020] Pacbio coverage >25x, using 25x of the longest reads
[Tue 28 Jul 18:22:46 CEST 2020] Refining alignments
ERROR: failed to merge alignments at position 401
       Please file a bug report
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: failed to merge alignments at position 126
       Please file a bug report
ERROR: failed to merge alignments at position 409
       Please file a bug report
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: failed to merge alignments at position 823
       Please file a bug report
ERROR: failed to merge alignments at position 438
       Please file a bug report
ERROR: failed to merge alignments at position 389
       Please file a bug report
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: failed to merge alignments at position 485
       Please file a bug report
ERROR: failed to merge alignments at position 1648
       Please file a bug report
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: failed to merge alignments at position 431
       Please file a bug report
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: failed to merge alignments at position 371
       Please file a bug report
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: Could not parse delta file, /dev/stdin
error no: 402
ERROR: failed to merge alignments at position 419
       Please file a bug report
[Tue 28 Jul 23:25:53 CEST 2020] Computing allowed merges
[Wed 29 Jul 00:12:20 CEST 2020] Joining
read sequence for  not found
read sequence for  not found
read sequence for  not found
read sequence for  not found
read sequence for  not found
read sequence for  not found
[Wed 29 Jul 01:18:39 CEST 2020] Gap consensus
[Wed 29 Jul 08:01:45 CEST 2020] Running assembly with Flye
[Wed 29 Jul 08:01:45 CEST 2020] Assembly with flye failed, see files under flye/

However, this time the installation produced warnings regarding mummer:

libtool: install: warning: `libumdmummer.la' has not been installed in `MaSuRCA-3.4.1/lib'
libtool: install: warning: relinking `swig/perl5/mummer.la'

But I still don't have any mummer libraries on the server besides MaSuRCA's, and again the test data set ran perfectly fine:

./assemble.sh 
[Thu 30 Jul 10:57:57 CEST 2020] Processing pe library reads
[Thu 30 Jul 10:58:13 CEST 2020] Average PE read length 250
[Thu 30 Jul 10:58:14 CEST 2020] Using kmer size of 127 for the graph
[Thu 30 Jul 10:58:14 CEST 2020] MIN_Q_CHAR: 33
WARNING: JF_SIZE set too low, increasing JF_SIZE to at least 187720586, this automatic increase may be not enough!
[Thu 30 Jul 10:58:14 CEST 2020] Creating mer database for Quorum
[Thu 30 Jul 10:58:38 CEST 2020] Error correct PE
[Thu 30 Jul 10:59:26 CEST 2020] Estimating genome size
[Thu 30 Jul 10:59:40 CEST 2020] Estimated genome size: 13813470
[Thu 30 Jul 10:59:40 CEST 2020] Creating k-unitigs with k=127
[Thu 30 Jul 11:00:55 CEST 2020] Computing super reads from PE 
[Thu 30 Jul 11:02:07 CEST 2020] Using CABOG from MaSuRCA-3.4.1/bin/../CA8/Linux-amd64/bin
[Thu 30 Jul 11:02:07 CEST 2020] Running mega-reads correction/assembly
[Thu 30 Jul 11:02:07 CEST 2020] Using mer size 15 for mapping, B=17, d=0.029
[Thu 30 Jul 11:02:07 CEST 2020] Estimated Genome Size 13813470
[Thu 30 Jul 11:02:07 CEST 2020] Estimated Ploidy 2
[Thu 30 Jul 11:02:07 CEST 2020] Using 126 threads
[Thu 30 Jul 11:02:07 CEST 2020] Output prefix mr.41.15.17.0.029
[Thu 30 Jul 11:02:07 CEST 2020] Pacbio coverage <25x, using the longest subreads
[Thu 30 Jul 11:02:09 CEST 2020] Reducing super-read k-mer size
[Thu 30 Jul 11:02:38 CEST 2020] Mega-reads pass 1
[Thu 30 Jul 11:02:38 CEST 2020] Running locally in 1 batch
[Thu 30 Jul 11:06:55 CEST 2020] Mega-reads pass 2
[Thu 30 Jul 11:06:55 CEST 2020] Running locally in 1 batch
[Thu 30 Jul 11:07:41 CEST 2020] Refining alignments
[Thu 30 Jul 11:07:55 CEST 2020] Computing allowed merges
[Thu 30 Jul 11:07:56 CEST 2020] Joining
[Thu 30 Jul 11:07:58 CEST 2020] Gap consensus
[Thu 30 Jul 11:07:58 CEST 2020] Generating assembly input files
[Thu 30 Jul 11:08:34 CEST 2020] Coverage threshold for splitting unitigs is 15 minimum ovl 250
[Thu 30 Jul 11:08:34 CEST 2020] Running assembly
[Thu 30 Jul 11:11:00 CEST 2020] Mega-reads initial assembly complete
[Thu 30 Jul 11:11:00 CEST 2020] No gap closing possible
[Thu 30 Jul 11:11:00 CEST 2020] Removing redundant scaffolds
[Thu 30 Jul 11:11:12 CEST 2020] Assembly complete, final scaffold sequences are in CA.mr.41.15.17.0.029/final.genome.scf.fasta
[Thu 30 Jul 11:11:12 CEST 2020] All done
[Thu 30 Jul 11:11:12 CEST 2020] Final stats for CA.mr.41.15.17.0.029/final.genome.scf.fasta
N50 807382
Sequence 12233031
Average 611652
E-size 845084
Count 20
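To compare the failing runs above, it may help to tally the distinct error messages in each captured log. The following is a minimal sketch, assuming the `assemble.sh` output was saved to a text file; the file name `assemble.log` and the helper `tally_errors` are hypothetical. Position numbers are replaced with `N` so that the "failed to merge alignments" messages are counted together.

```python
import re
from collections import Counter
from pathlib import Path

# Collapse "at position 401", "at position 126", ... into one message.
_POSITION = re.compile(r"(at position) \d+")


def tally_errors(lines):
    """Count lines that start with 'ERROR:', ignoring the position number."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        if text.startswith("ERROR:"):
            counts[_POSITION.sub(r"\1 N", text)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical file holding the captured assemble.sh output.
    log_path = Path("assemble.log")
    if log_path.exists():
        with log_path.open(errors="replace") as handle:
            for message, count in tally_errors(handle).most_common():
                print(f"{count:6d}  {message}")
```

The tally only groups identical messages; it says nothing about which reads or alignments triggered them.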