sebhtml / ray

Ray -- Parallel genome assemblies for parallel DNA sequencing
http://denovoassembler.sf.net

missing edges ? #75

Closed sebhtml closed 11 years ago

sebhtml commented 11 years ago

Setup:

genomeX.fasta / 74 kb

16 k sequences, 1% error

Coverage distribution:

2	246698
3	6086
4	7978
5	14022
6	19264
7	21844
8	21526
9	18366
10	15104
11	10252
12	6788
13	3956
14	2158
15	1034
16	518
17	190
18	58
19	30
20	28
21	6

Degree distribution:

The count for (0,0) should be even.

The counts for (0,1) and (1,0) should be the same.

0	0	225649
0	1	10869
0	2	0
0	3	0
0	4	0
1	0	10862
1	1	148206
1	2	159
1	3	0
1	4	0
2	0	0
2	1	159
2	2	2
2	3	0
2	4	0
3	0	0
3	1	0
3	2	0
3	3	0
3	4	0
4	0	0
4	1	0
4	2	0
4	3	0
4	4	0
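The two invariants stated above can be checked mechanically. The following is only a minimal sketch: `DegreeDistribution`, `countOf`, `findSymmetryViolations` and `reportedDistribution` are hypothetical names for illustration and are not part of Ray. The data are the nonzero rows listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type, not part of Ray: (first degree, second degree) -> vertex count.
using DegreeDistribution = std::map<std::pair<int, int>, std::uint64_t>;

// Returns 0 for degree pairs that are absent from the distribution.
std::uint64_t countOf(const DegreeDistribution& d, int a, int b) {
    auto it = d.find(std::make_pair(a, b));
    return it == d.end() ? 0 : it->second;
}

// Lists every violation of the two invariants:
// the (0,0) count is even, and the (a,b) count equals the (b,a) count.
std::vector<std::string> findSymmetryViolations(const DegreeDistribution& d) {
    std::vector<std::string> problems;
    if (countOf(d, 0, 0) % 2 != 0) {
        problems.push_back("(0,0) is odd");
    }
    // Normalize each off-diagonal pair to (low, high) so it is reported once.
    std::set<std::pair<int, int>> mismatched;
    for (const auto& entry : d) {
        int a = entry.first.first;
        int b = entry.first.second;
        assert(a >= 0 && b >= 0);  // degrees cannot be negative
        if (a == b) continue;
        int low = std::min(a, b);
        int high = std::max(a, b);
        if (countOf(d, low, high) != countOf(d, high, low)) {
            mismatched.insert(std::make_pair(low, high));
        }
    }
    for (const auto& p : mismatched) {
        problems.push_back(
            "(" + std::to_string(p.first) + "," + std::to_string(p.second) + ") = " +
            std::to_string(countOf(d, p.first, p.second)) + " but (" +
            std::to_string(p.second) + "," + std::to_string(p.first) + ") = " +
            std::to_string(countOf(d, p.second, p.first)));
    }
    return problems;
}

// Nonzero rows of the degree distribution reported in this issue.
DegreeDistribution reportedDistribution() {
    return {
        {{0, 0}, 225649}, {{0, 1}, 10869}, {{1, 0}, 10862}, {{1, 1}, 148206},
        {{1, 2}, 159},    {{2, 1}, 159},   {{2, 2}, 2},
    };
}
```

Calling `findSymmetryViolations(reportedDistribution())` reports two problems: the (0,0) count is odd, and the (0,1) count differs from the (1,0) count (10869 against 10862).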

This asymmetry can cause the following error:

Error: The seed contains a choice not supported by the graph.

This happens because every edge should also have its reverse-complement edge in the graph.
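To illustrate why the degree distribution must be symmetric, here is a small sketch of the reverse-complement edge rule. It assumes the usual de Bruijn convention that an edge u -> v joins two k-mers overlapping by k-1 nucleotides; the helper names (`reverseComplement`, `isValidEdge`, `reverseComplementEdge`) are hypothetical and are not taken from Ray's code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>

// Complement of a single nucleotide (assumes upper-case A, C, G, T only).
char complement(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:
            assert(false && "expected A, C, G or T");
            return 'N';
    }
}

// Reverses the k-mer, then complements every nucleotide.
std::string reverseComplement(const std::string& kmer) {
    std::string result(kmer.rbegin(), kmer.rend());
    std::transform(result.begin(), result.end(), result.begin(), complement);
    return result;
}

// Hypothetical edge type: (source k-mer, destination k-mer).
using Edge = std::pair<std::string, std::string>;

// An edge is valid when the last k-1 nucleotides of the source
// equal the first k-1 nucleotides of the destination.
bool isValidEdge(const Edge& e) {
    const std::string& u = e.first;
    const std::string& v = e.second;
    if (u.empty() || u.size() != v.size()) return false;
    return u.substr(1) == v.substr(0, v.size() - 1);
}

// The twin of u -> v is rc(v) -> rc(u).
Edge reverseComplementEdge(const Edge& e) {
    return Edge(reverseComplement(e.second), reverseComplement(e.first));
}
```

For example, the edge ATG -> TGC has the twin GCA -> CAT. Each outgoing edge of a k-mer is therefore an ingoing edge of its reverse complement, so a vertex with degrees (a,b) is mirrored by a vertex with degrees (b,a). If one edge of a pair is missing, the counts for (a,b) and (b,a) drift apart.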

sebhtml commented 11 years ago

I turned off the null edge purger to see how the degree distribution changes. Command:

mpiexec -output-filename degree-bug \
 -n $NSLOTS ~/Ray -p u1.fasta u2.fasta -o degree-bug \
 -bloom-filter-bits 0

patch:

diff --git a/code/plugin_EdgePurger/EdgePurgerWorker.cpp b/code/plugin_EdgePurger/EdgePurgerWorker.cpp
index 367edd7..d6e0eca 100644
--- a/code/plugin_EdgePurger/EdgePurgerWorker.cpp
+++ b/code/plugin_EdgePurger/EdgePurgerWorker.cpp
@@ -48,7 +48,7 @@ void EdgePurgerWorker::work(){

                                CoverageDepth coverage=response[0];

-                               if(coverage<m_parameters->getMinimumCoverageToStore()){
+                               if(false &&coverage<m_parameters->getMinimumCoverageToStore()){
                                        m_vertex->deleteIngoingEdge(&m_currentKmer,&vertex,m_parameters->getWordSize());
                                }
                                m_iterator++;
@@ -81,7 +81,7 @@ void EdgePurgerWorker::work(){

                                CoverageDepth coverage=response[0];

-                               if(coverage<m_parameters->getMinimumCoverageToStore()){
+                               if(false &&coverage<m_parameters->getMinimumCoverageToStore()){
                                        m_vertex->deleteOutgoingEdge(&m_currentKmer,&vertex,m_parameters->getWordSize());
                                }
                                m_iterator++;

Result:

0	0	5
0	1	6383
0	2	0
0	3	0
0	4	0
1	0	6174
1	1	363860
1	2	9155
1	3	197
1	4	2
2	0	0
2	1	9155
2	2	738
2	3	18
2	4	0
3	0	0
3	1	197
3	2	18
3	3	2
3	4	0
4	0	0
4	1	2
4	2	0
4	3	0
4	4	0

Even with the purger disabled, (0,0) is odd (5) and (0,1) differs from (1,0) (6383 vs 6174). Therefore the problem occurs before the purging of null edges.

sebhtml commented 11 years ago

With fix:

0	0	0
0	1	6179
0	2	0
0	3	0
0	4	0
1	0	6179
1	1	364064
1	2	9155
1	3	197
1	4	2
2	0	0
2	1	9155
2	2	738
2	3	18
2	4	0
3	0	0
3	1	197
3	2	18
3	3	2
3	4	0
4	0	0
4	1	2
4	2	0
4	3	0
4	4	0

sebhtml commented 11 years ago

dda2d53775b63582eb0e9907b5b8fd5841f1a5b4