ariloytynoja / pagan-msa

Automatically exported from code.google.com/p/pagan-msa

what(): std::bad_alloc (bppdist?) #9

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. pagan -s 22918428.fst

What is the expected output? What do you see instead?
 aligning node #1# (1/21): Burkholderia_mallei_NCTC_10229 - Burkholderia_mallei_SAVP1.terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

What version of the product are you using? On what operating system?
PAGAN v.0.56 precompiled, Ubuntu 14.04

Please provide any additional information below.
I assume it's a bug in bppdist. Updating bppdist to the most recent version 
(2.2.0) doesn't help.
Is there a way to force pagan to use another software instead?

Original issue reported on code.google.com by iakov.da...@gmail.com on 31 Mar 2015 at 9:14

GoogleCodeExporter commented 9 years ago

Original comment by iakov.da...@gmail.com on 31 Mar 2015 at 9:15

Attachments:

GoogleCodeExporter commented 9 years ago
The linear-memory algorithm cannot be used for alignment of sequence graphs, and 
Pagan currently needs memory proportional to L^2, where L is the sequence 
length. For sequences of 10 kb that means around 15 GB.
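
To make the quadratic growth concrete, here is a rough sketch of the estimate. The per-cell cost of 150 bytes is an assumption, back-calculated from the two figures quoted above (about 15 GB for 10 kb sequences); it is not taken from Pagan's source code, and the function name is hypothetical.

```python
def dp_matrix_bytes(seq_len, bytes_per_cell):
    """Memory needed for a full seq_len x seq_len dynamic-programming matrix."""
    return seq_len * seq_len * bytes_per_cell


# ASSUMPTION: 150 bytes per cell is derived from the numbers in this thread
# (15 GB / 10,000^2 cells), not from Pagan's actual data structures.
BYTES_PER_CELL = 150

for length in (1_000, 5_000, 10_000, 20_000):
    gb = dp_matrix_bytes(length, BYTES_PER_CELL) / 1e9
    print(f"{length:>6} bp -> {gb:8.2f} GB")
```

Under this assumption, doubling the sequence length quadruples the memory requirement, which is why raising limits only helps up to the physical RAM available.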

However, if a translated alignment with back-translation to DNA is acceptable, 
the data work fine with the command:

pagan -s 22918428.fst --anchors-offset 100 --translate --outfile 22918428_pagan

This requires Pagan v.20150401.

A proper fix for alignment of long sequences has been planned.

Original comment by ari.loyt...@gmail.com on 1 Apr 2015 at 8:41

GoogleCodeExporter commented 9 years ago
Thank you very much. I didn't realize that it's a memory issue.
So increasing ulimit on my system also worked.

Original comment by iakov.da...@gmail.com on 1 Apr 2015 at 10:12

GoogleCodeExporter commented 9 years ago
Update: No, increasing ulimit by itself doesn't help. Now the error is:
aligning node #7# (7/21): #4# - #6#.terminate called after throwing an instance 
of 'std::out_of_range'
  what():  vector::_M_range_check
Aborted (core dumped)

I will try the workaround with back-translation and the updated Pagan.

Original comment by iakov.da...@gmail.com on 1 Apr 2015 at 10:31

GoogleCodeExporter commented 9 years ago
This suffers from the same anchoring/tunneling problem.

The anchoring speeds up the alignment by defining a tunnel around the probable 
path and then ignoring a large fraction of the dynamic programming (dp) matrix 
in the alignment computation. However, Pagan aligns graphs and the graphs can 
contain edges that point far backwards in the dp matrix.  By default, the 
tunnel is quite narrow and the program currently doesn't check if the backward 
pointers stay within the tunnel and in the valid part of the dp matrix. In some 
cases the anchors (and the tunnel) may also be incompatible with the structure 
and "active part" of the graph. 
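
The failure mode described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the tunnel is taken to be a fixed-width band around the main diagonal, one input is a graph given as a hypothetical `predecessors` mapping, and none of the names correspond to Pagan's internals.

```python
def in_tunnel(i, j, offset):
    """True if cell (i, j) lies within `offset` of the main diagonal."""
    return abs(i - j) <= offset


def escaping_lookups(predecessors, n_cols, offset):
    """List DP lookups that fall outside the tunnel.

    predecessors: dict mapping a graph node index to the indices of the
    nodes with an edge into it (a linear sequence would only have i - 1).
    Returns (i, j, p) triples where filling cell (i, j) needs cell
    (p, j - 1), but that cell was never computed.
    """
    bad = []
    for i, preds in predecessors.items():
        for j in range(1, n_cols):
            if not in_tunnel(i, j, offset):
                continue  # cell is skipped by the tunnel anyway
            for p in preds:
                if not in_tunnel(p, j - 1, offset):
                    bad.append((i, j, p))
    return bad


# Node 50 has a normal edge from node 49 and a long backward edge from node 10.
graph = {5: [4], 50: [49, 10]}
print(len(escaping_lookups(graph, n_cols=60, offset=3)))   # narrow tunnel
print(len(escaping_lookups(graph, n_cols=60, offset=60)))  # wide tunnel
```

In this model a narrow band leaves the long backward edge pointing at uncomputed cells, while a sufficiently wide band covers them; an unchecked read of such a cell is the kind of access that could surface as a range error.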

For now, the only way around this issue is to widen the tunnel 
(--anchors-offset) or ignore the anchors altogether (--no-anchors). 

The example data work with "--anchors-offset 300" when aligned as DNA and with 
"--anchors-offset 100" when aligned as protein. The DNA alignment requires 
~16 GB of RAM, the protein alignment ~2 GB. 
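
The gap between the two memory figures is roughly what quadratic scaling predicts: translating to protein shortens each sequence by a factor of three, so the matrix shrinks by about a factor of nine. A small sketch of that arithmetic, assuming the per-cell cost is the same in both modes (which this thread does not confirm) and using a hypothetical function name:

```python
def translated_memory_gb(dna_memory_gb, codon_len=3):
    """Scale a DNA-mode memory figure to protein mode, assuming memory ~ L^2."""
    return dna_memory_gb / codon_len ** 2


# 16 GB is the DNA figure quoted above; the result is only an estimate.
print(round(translated_memory_gb(16.0), 2))
```

The estimate lands a little under 2 GB, in line with the figure reported for the protein alignment.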

Original comment by ari.loyt...@gmail.com on 1 Apr 2015 at 11:09