HingeAssembler / HINGE

Software accompanying "HINGE: Long-Read Assembly Achieves Optimal Repeat Resolution"
http://genome.cshlp.org/content/27/5/747.full.pdf+html?sid=39918b0d-7a7d-4a12-b720-9238834902fd

Out of range error in draft_assembly step #65

Closed ghost closed 8 years ago

ghost commented 8 years ago

I get the following error in the draft_assembly step after the process has been running for quite a while:

...
In total 450 lanes
0
482901
list size:0
list size:0
list size:0
list size:0
list size:0
list size:0
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr: __pos (which is 13829) > this->size() (which is 10798)
Aborted

Maybe it is connected to #64? The process runs for quite a while before I hit the error. The initial FASTA file holds about 2.5 GB of data, but I do not think the error is related to the amount of data.

What else do you need for debugging?

fxia22 commented 8 years ago

Can you show us the first several lines of .edges.list in your folder? How many >Unitig does this file have? Thank you.

ghost commented 8 years ago

cat pk.edges.list | grep '>Unitig' | wc -l gives me 90. The file has a size of 81K.

And this is the start of my *.edges.list:

>Unitig0
2356 1 4678 1 45461 6934 29905 1311 23801
4678 1 3340 1 13930 11581 18668 0 6843
3340 1 7239 1 22807 7376 18965 11630 22848
7239 1 2285 0 39355 0 19328 0 20027
2285 0 2726 0 34930 4568 22229 0 17269
2726 0 4521 0 44352 0 23045 9992 31299
4521 0 6082 0 52400 0 26287 1356 27469
6082 0 6948 1 46265 3418 26655 3954 26982
6948 1 3984 0 43512 0 21567 0 21945
3984 0 2683 0 40778 7638 28684 0 19732
2683 0 312 0 33630 7874 24124 593 17973
312 0 2329 0 37770 2903 21682 0 18991
2329 0 4371 1 59781 0 31155 1211 29837
4371 1 7462 0 37048 0 19634 0 17414
7462 0 563 1 36577 11352 30641 976 18264
>Unitig1
2356 0 4681 0 4427 28414 30643 4125 6323
>Unitig2
1290 0 4293 0 10465 0 5379 9471 14557
4293 0 4054 0 4864 10365 12718 0 2511
4054 0 460 1 16471 1999 10487 5509 13492
460 1 3100 0 11712 0 5891 0 5821
3100 0 5944 0 17248 10908 19571 0 8585
5944 0 4948 0 25075 10130 22704 0 12501
4948 0 5987 1 8029 10107 14197 13470 17409
5987 1 3296 1 11280 7106 12916 8885 14355
3296 1 5753 0 11969 0 6080 3912 9801
5753 0 1560 1 17958 6727 16125 12568 21128
1560 1 1691 0 33013 0 17215 0 15798
1691 0 7739 0 24989 0 12479 17603 30113
7739 0 6844 0 40597 5087 25322 0 20362
6844 0 6741 1 34744 3283 20782 7260 24505
6741 1 7669 0 11471 0 5472 0 5999
7669 0 1590 0 39345 875 19702 0 20518
1590 0 1898 0 24218 0 11977 11308 23549
1898 0 2013 1 31430 1914 17615 11033 26762
2013 1 2066 1 34034 0 17119 7034 23949
2066 1 3642 0 15188 0 7568 0 7620
3642 0 1529 1 40971 4930 25692 5132 25341
1529 1 5567 0 15747 0 8245 0 7502
>Unitig3
...

fxia22 commented 8 years ago

Thank you for your reply. These few lines look fine, but do you have any empty Unitigs, that is, a >Unitig* header followed by an empty line or by nothing?

For example, could you check the output of the command: cat pk.edges.list | grep -A 1 '>Unitig'
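The same check can be scripted. Below is a minimal sketch, assuming the .edges.list layout matches the excerpt posted earlier in this thread: a `>Unitig<N>` header followed by one line per edge. The function name `find_empty_unitigs` and the sample text are hypothetical; `Unitig1` and `Unitig3` in the sample are deliberately made-up empty entries, not data from the reported file.

```python
def find_empty_unitigs(text):
    """Return names of unitigs whose header is followed by no edge line."""
    empty = []
    current = None      # name of the unitig currently being read
    has_edges = False   # True once a non-blank line follows the header
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">Unitig"):
            # Close out the previous unitig before starting a new one.
            if current is not None and not has_edges:
                empty.append(current)
            current = line[1:]
            has_edges = False
        elif line and current is not None:
            has_edges = True
    # The last unitig has no following header, so check it separately.
    if current is not None and not has_edges:
        empty.append(current)
    return empty


# Hypothetical sample: Unitig1 is followed by a blank line, Unitig3 by nothing.
sample = """>Unitig0
2356 1 4678 1 45461 6934 29905 1311 23801
>Unitig1

>Unitig2
1290 0 4293 0 10465 0 5379 9471 14557
>Unitig3
"""
print(find_empty_unitigs(sample))  # ['Unitig1', 'Unitig3']
```

To run it on a real file, read the file contents into a string first and pass that string to the function.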

ilanshom commented 8 years ago

@MichaelsGITIGIT, if you want, you can send us the .edges.list file. You wouldn't be sharing any data, and it may make it easier for us to figure out the issue.

ghost commented 8 years ago

pk.edges.list.zip

ghost commented 8 years ago

Hi @all

  1. I attached the *.edges.list file in the previous comment.
  2. I do NOT see any >Unitig* followed by an empty line or by nothing, but maybe you can double-check. What I do see is many >Unitig* entries with just one short line following.
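The observation about single-line unitigs can be quantified with a short script. This is a sketch under the same assumption about the file layout (a `>Unitig<N>` header, then one line per edge); the helper name `edges_per_unitig` is hypothetical, and the sample reuses lines from the excerpt posted above.

```python
def edges_per_unitig(text):
    """Map each unitig name to the number of edge lines under its header."""
    counts = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            current = line[1:]
            counts[current] = 0
        elif line and current is not None:
            counts[current] += 1
    return counts


# Lines copied from the excerpt earlier in the thread.
sample = """>Unitig0
2356 1 4678 1 45461 6934 29905 1311 23801
4678 1 3340 1 13930 11581 18668 0 6843
>Unitig1
2356 0 4681 0 4427 28414 30643 4125 6323
"""
counts = edges_per_unitig(sample)
single = [name for name, n in counts.items() if n == 1]
print(single)  # ['Unitig1']
```

A unitig with a count of 0 would correspond to the empty case asked about earlier.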

govinda-kamath commented 8 years ago

Hi @MichaelsGITIGIT,

Could you also send us the pk.G2.graphml file? You wouldn't be sharing any data.

ghost commented 8 years ago

Here you go...

pkid.G2.graphml.zip

fxia22 commented 8 years ago

Hi @MichaelsGITIGIT, we made some changes to draft_assembly. Could you try it again and see whether that solves your problem?

Thanks!

ghost commented 8 years ago

Now it runs through without any problem. I have only run it once, but the issue seems to be solved. I'll close the thread. Thanks!