HingeAssembler / HINGE

Software accompanying "HINGE: Long-Read Assembly Achieves Optimal Repeat Resolution"
http://genome.cshlp.org/content/27/5/747.full.pdf+html?sid=39918b0d-7a7d-4a12-b720-9238834902fd

segmentation error draft #86

Closed alimayy closed 7 years ago

alimayy commented 7 years ago

Hi @fxia22, last Friday I compiled the dev branch, and I'm getting a segmentation fault at the draft assembly stage. The old binary works fine. Let me know if I can help with troubleshooting.

...
...
...
T 29900 0 18940 0 12551
T 18940 0 64634 1 12121
T 64634 1 82532 1 11983
T 82532 1 26315 1 13473
T 26315 1 74944 1 13115
T 74944 1 15618 0 13322
T 15618 0 35434 0 13207
T 35434 0 44799 1 16938
E 44799 1 82360 0 3587 662
S 14312 1 6967 1 21870 0
E 6967 1 14312 1 21522 16053
S 14312 0 6967 0 21522 0
E 6967 0 14312 0 21870 16053
Error! Wrong format.

/mnt/nfs/programs/HINGE-dev/src/hinge: line 8: 31091 Segmentation fault      draft_assembly $*

fxia22 commented 7 years ago

Hi @alimayy, we changed the format of the path file, so can you run hinge draft-path first? Thanks.
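When narrowing down a "Wrong format" failure like the one quoted above, it can help to summarize the record lines that the draft stage printed before it stopped. The following is a minimal sketch, not part of HINGE: it assumes only what is visible in the quoted stdout, namely lines starting with an `S`, `T`, or `E` tag followed by integers, and the literal message `Error! Wrong format.`. The function name `summarize` and the four-line context window are hypothetical choices made for illustration. The script does not know the real path file format; it only reports how many fields each tag carried and which records came last.

```python
import re
from collections import defaultdict

# Record lines as they appear in the draft-stage stdout quoted in this thread:
# a one-letter tag (S, T or E) followed by whitespace-separated integers.
# This pattern is an assumption based on that output, not a format spec.
RECORD = re.compile(r"^([STE])((?:\s+\d+)+)\s*$")
ERROR_MARKER = "Error! Wrong format."  # message printed before the crash


def summarize(log_text):
    """Return (field counts seen per tag, records printed just before the error)."""
    field_counts = defaultdict(set)
    recent = []          # sliding window of the latest record lines
    before_error = None  # stays None if the marker never appears
    for line in log_text.splitlines():
        line = line.strip()
        if line == ERROR_MARKER and before_error is None:
            before_error = list(recent)
            continue
        match = RECORD.match(line)
        if match:
            field_counts[match.group(1)].add(len(match.group(2).split()))
            recent = (recent + [line])[-4:]  # keep the last four records
    return dict(field_counts), before_error


# Tail of the stdout quoted in the first comment of this thread.
sample = """\
T 35434 0 44799 1 16938
E 44799 1 82360 0 3587 662
S 14312 1 6967 1 21870 0
E 6967 1 14312 1 21522 16053
S 14312 0 6967 0 21522 0
E 6967 0 14312 0 21870 16053
Error! Wrong format.
"""

counts, context = summarize(sample)
print(counts)
print(context)
```

A tag that shows up with more than one field count, or a last record that looks different from its neighbours, would be a reasonable thing to include in a bug report. In the sample above every tag has a single field count, so the output alone does not show which record the parser rejected.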

alimayy commented 7 years ago

Hi @fxia22, I was running hinge draft-path first. The stdout is

.....
.....
*********Executing the command**************
hinge draft-path ./ 9_1_19376 9_1_193769_1_19376_run_id.G2.graphml
------------------------------------------------
Number of contigs: 22

*********Executing the command**************
hinge draft --db 9_1_19376 --las 9_1_19376.las --prefix 9_1_19376 --config /mnt/nfs/programs/HINGE-dev/utils/nominal.ini --out 9_1_19376.draft
------------------------------------------------
[2016-11-18 16:17:27.422] [log] [info] draft consensus
[2016-11-18 16:17:27.422] [log] [info] name of db: 9_1_19376, name of .las file 9_1_19376.las
[2016-11-18 16:17:27.422] [log] [info] name of fasta: , name of .paf file 
[2016-11-18 16:17:27.422] [log] [info] filter files prefix: 9_1_19376
[2016-11-18 16:17:27.422] [log] [info] output prefix: 9_1_19376.draft
[2016-11-18 16:17:27.422] [log] [info] Parameters passed in 

[filter]
length_threshold = 1000;
quality_threshold = 0.23;
n_iter = 3; // filter iteration
aln_threshold = 1000;
min_cov = 5;
cut_off = 300;
theta = 300;
use_qv = true;

[running]
n_proc = 12;

[draft]
min_cov = 10;
trim = 200;
edge_safe = 100;
tspace = 900;
step = 50;

[consensus]
min_length = 4000;
trim_end = 200;
best_n = 1;
quality_threshold = 0.23;

[layout]
hinge_slack = 1000
min_connected_component_size = 8

[2016-11-18 16:17:27.450] [log] [info] Load alignments from 9_1_19376.las
[2016-11-18 16:17:27.450] [log] [info] # Alignments: 19474838
[2016-11-18 16:17:27.451] [log] [info] # Reads: 89433
[2016-11-18 16:17:39.644] [log] [info] Input data finished
[2016-11-18 16:17:39.645] [log] [info] LENGTH_THRESHOLD = 1000
[2016-11-18 16:17:39.645] [log] [info] QUALITY_THRESHOLD = 0.23
[2016-11-18 16:17:39.645] [log] [info] ALN_THRESHOLD = 1000
[2016-11-18 16:17:39.645] [log] [info] MIN_COV = 5
[2016-11-18 16:17:39.645] [log] [info] CUT_OFF = 300
[2016-11-18 16:17:39.645] [log] [info] THETA = 300
[2016-11-18 16:17:39.645] [log] [info] N_ITER = 3
[2016-11-18 16:17:39.645] [log] [info] THETA2 = 0
[2016-11-18 16:17:39.645] [log] [info] N_PROC = 12
[2016-11-18 16:17:39.645] [log] [info] HINGE_SLACK = 1000
[2016-11-18 16:17:39.645] [log] [info] HINGE_TOLERANCE = 150
[2016-11-18 16:17:39.645] [log] [info] KILL_HINGE_OVERLAP_ALLOWANCE = 300
[2016-11-18 16:17:39.645] [log] [info] KILL_HINGE_INTERNAL_ALLOWANCE = 40
[2016-11-18 16:17:39.645] [log] [info] MATCHING_HINGE_SLACK = 200
[2016-11-18 16:17:39.645] [log] [info] MIN_CONNECTED_COMPONENT_SIZE = 8
add data
add data
S 81423 0 58911 1 12087 0
T 58911 1 75962 1 10362
T 75962 1 73394 1 12351
......
S 14312 1 6967 1 21870 0
E 6967 1 14312 1 21522 16053
S 14312 0 6967 0 21522 0
E 6967 0 14312 0 21870 16053
Error! Wrong format.
......

And the stderr is

.....
.....
LAmerge 9_1_19376.5 L1.5.1 L1.5.2 L1.5.3 L1.5.4 L1.5.5 L1.5.6
LAmerge 9_1_19376.6 L1.6.1 L1.6.2 L1.6.3 L1.6.4 L1.6.5 L1.6.6

LAcheck -vS 9_1_19376 9_1_19376.1
  9_1_19376.1: 3,621,378 all OK
LAcheck -vS 9_1_19376 9_1_19376.2
  9_1_19376.2: 3,655,510 all OK
LAcheck -vS 9_1_19376 9_1_19376.3
  9_1_19376.3: 3,652,201 all OK
LAcheck -vS 9_1_19376 9_1_19376.4
  9_1_19376.4: 3,610,230 all OK
LAcheck -vS 9_1_19376 9_1_19376.5
  9_1_19376.5: 3,639,923 all OK
LAcheck -vS 9_1_19376 9_1_19376.6
  9_1_19376.6: 1,295,596 all OK

rm L1.1.1.las L1.1.2.las L1.1.3.las L1.1.4.las L1.1.5.las L1.1.6.las
rm L1.2.1.las L1.2.2.las L1.2.3.las L1.2.4.las L1.2.5.las L1.2.6.las
rm L1.3.1.las L1.3.2.las L1.3.3.las L1.3.4.las L1.3.5.las L1.3.6.las
rm L1.4.1.las L1.4.2.las L1.4.3.las L1.4.4.las L1.4.5.las L1.4.6.las
rm L1.5.1.las L1.5.2.las L1.5.3.las L1.5.4.las L1.5.5.las L1.5.6.las
rm L1.6.1.las L1.6.2.las L1.6.3.las L1.6.4.las L1.6.5.las L1.6.6.las
/mnt/nfs/programs/HINGE-dev/src/hinge: line 8: 21532 Segmentation fault      draft_assembly $*
hinge draft --db 9_1_19376 --las 9_1_19376.las --prefix 9_1_19376 --config /mnt/nfs/programs/HINGE-dev/utils/nominal.ini --out 9_1_19376.draft did not produce a return code of 0, quiting!
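
One quick sanity check on the logs above is whether the alignment input itself looks intact. The sketch below sums the per-block counts from the `LAcheck` lines and compares the total with the `# Alignments` value printed by the draft stage. It assumes that the number `LAcheck` prints before `all OK` is the alignment count of that block and that these six blocks are what ended up in `9_1_19376.las`; neither is confirmed by the log itself. The helper names `lacheck_total` and `reported_alignments` are hypothetical.

```python
import re

# Lines copied from the stderr and stdout quoted above.
lacheck_output = """\
  9_1_19376.1: 3,621,378 all OK
  9_1_19376.2: 3,655,510 all OK
  9_1_19376.3: 3,652,201 all OK
  9_1_19376.4: 3,610,230 all OK
  9_1_19376.5: 3,639,923 all OK
  9_1_19376.6: 1,295,596 all OK
"""
draft_log_line = "[2016-11-18 16:17:27.450] [log] [info] # Alignments: 19474838"


def lacheck_total(text):
    """Sum the counts from 'NAME: N all OK' lines (commas removed)."""
    return sum(
        int(m.group(1).replace(",", ""))
        for m in re.finditer(r":\s*([\d,]+)\s+all OK", text)
    )


def reported_alignments(line):
    """Extract the integer after '# Alignments:', or None if absent."""
    m = re.search(r"# Alignments:\s*(\d+)", line)
    return int(m.group(1)) if m else None


total = lacheck_total(lacheck_output)
loaded = reported_alignments(draft_log_line)
print(total, loaded, total == loaded)
```

For the numbers in this thread the two values agree (19,474,838), which suggests that the alignments were loaded completely and that the failure lies later in the draft stage.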

fxia22 commented 7 years ago

Hi @alimayy, sorry about that. Can you try again?

alimayy commented 7 years ago

@fxia22 Thanks a lot, it works now. Before closing the issue, can you briefly comment on what the bug in draft was that you fixed (I mean not the one I reported above, but the one that made you change the code in the first place)? Also, is there any news about the paper (i.e., a new version of the preprint)? Is a major version release coming soon?

By the way, last week's version of HINGE did my last assembly in 53 minutes; the current version did it in 28 minutes, almost twice as fast. What did you guys change?

fxia22 commented 7 years ago

@alimayy Previously there was a bug that caused cutting problems in short contigs, and I attempted to fix that. According to your feedback in #71, it might not be completely fixed yet, so I will do more tests.

We are doing some system-level optimization of HINGE, including making the memory footprint smaller, and that caused the performance boost.

ilanshom commented 7 years ago

@alimayy, regarding the paper, it is currently under review. We should be hearing back from the reviewers soon, and we will update the paper based on their feedback. It will probably take us a few weeks until we have a new version.

alimayy commented 7 years ago

Congratulations @fxia22, seems like you can close this one.

The dataset that used to give me this error now produces a 28 MB fungal HINGE assembly in 5.5 hours (on a relatively new architecture, 24 CPUs), with the correct genome size and GC%.

fxia22 commented 7 years ago

Thanks for your feedback @alimayy