sebhtml / ray

Ray -- Parallel genome assemblies for parallel DNA sequencing
http://denovoassembler.sf.net

Crash with the extension #52

Closed sebhtml closed 12 years ago

sebhtml commented 12 years ago

Ray: plugin_SeedExtender/SeedExtender.cpp:2067: int SeedExtender::chooseWithSeed(): Assertion `false' failed.
Error: The seed contains a choice not supported by the graph.
Extension length: 177 vertices
position=177 AAGCGTGGGGATCAAACAGGATTAGATACCC with 3 choices
Seed length: 189 vertices
Choices:
AAGCGTGGGGATCAAACAGGATTAGATACCA
AAGCGTGGGGATCAAACAGGATTAGATACCG
AAGCGTGGGGATCAAACAGGATTAGATACCT
[r108-n91:01681] *** Process received signal ***
[r108-n91:01681] Signal: Aborted (6)
[r108-n91:01681] Signal code: (-6)
[r108-n91:01681] [ 0] /lib64/libpthread.so.0 [0x2ac93cacbb70]
[r108-n91:01681] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x2ac93cd09265]
[r108-n91:01681] [ 2] /lib64/libc.so.6(abort+0x110) [0x2ac93cd0ad10]
[r108-n91:01681] [ 3] /lib64/libc.so.6(__assert_fail+0xf6) [0x2ac93cd026e6]
[r108-n91:01681] [ 4] Ray(_ZN12SeedExtender14chooseWithSeedEv+0x367) [0x511547]
[r108-n91:01681] [ 5] Ray(_ZN12SeedExtender8doChoiceEP13RingAllocatorPiP12StaticVectorP4KmerP10BubbleDataiiP13ExtensionDataiP20OpenAssemblerChooserP7ChooserPSt6vectorI12AssemblySeedSaISG_EEPbSK_SK_iS2_SK_PSF_IS5_SaIS5_EE+0x11c5) [0x51a055]
[r108-n91:01681] [ 6] Ray(_ZN12SeedExtender29call_RAY_SLAVE_MODE_EXTENSIONEv+0x216) [0x51a936]
[r108-n91:01681] [ 7] Ray(_ZN11ComputeCore10runVanillaEv+0x9b) [0x5793ab]
[r108-n91:01681] [ 8] Ray(_ZN7Machine5startEv+0x1c26) [0x460286]
[r108-n91:01681] [ 9] Ray(main+0x2b) [0x45df1b]
[r108-n91:01681] [10] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2ac93ccf6994]
[r108-n91:01681] [11] Ray(_ZNSt8ios_base4InitD1Ev+0x49) [0x45b739]
[r108-n91:01681] *** End of error message ***

sebhtml commented 12 years ago

mpiexec -n 128 Ray \
 -o Assembly \
 -k 31 \
 -s Sample/SRR072221.fastq.gz \
 -s Sample/SRR072222.fastq.gz \
 -s Sample/SRR072223.fastq.gz \
 -s Sample/SRR072232.fastq.gz \
 -s Sample/SRR072236.fastq.gz \
 -s Sample/SRR072237.fastq.gz \
 -s Sample/SRR072247.fastq.gz \
 -s Sample/SRR172903.fastq.gz \
 -search /rap/nne-790-ab/genomes/EMBL_CDS+GO/EMBL_CDS_Sequences \
 -gene-ontology /rap/nne-790-ab/genomes/EMBL_CDS+GO/000-Ontologies.txt /rap/nne-790-ab/genomes/EMBL_CDS+GO/000-Annotations.txt \
 -search /rap/nne-790-ab/genomes/RayKmerSearchStuff/last-build/ARDB \
 -search /rap/nne-790-ab/genomes/RayKmerSearchStuff/last-build/Bacteria-Genomes \
 -search /rap/nne-790-ab/genomes/RayKmerSearchStuff/last-build/HumanChromosomes \
 -search /rap/nne-790-ab/genomes/RayKmerSearchStuff/last-build/NCBI-Bacteria_DRAFT \
 -search /rap/nne-790-ab/genomes/RayKmerSearchStuff/last-build/Viruses-Genomes \
 -with-taxonomy /rap/nne-790-ab/genomes/taxonomy/last-build/Genome-to-Taxon.tsv /rap/nne-790-ab/genomes/taxonomy/last-build/TreeOfLife-Edges.tsv /rap/nne-790-ab/genomes/taxonomy/last-build/Taxon-Names.tsv

sebhtml commented 12 years ago

Turning on software message routing will probably solve this strange issue...

2*2*2*2*2*2*2 = 128

sebhtml commented 12 years ago

Error message from the log:

Error: The seed contains a choice not supported by the graph.
Extension length: 177 vertices
position=177 AAGCGTGGGGATCAAACAGGATTAGATACCC with 3 choices
The previous kmer is AAAGCGTGGGGATCAAACAGGATTAGATACC
Seed length: 189 vertices
Choices:
AAGCGTGGGGATCAAACAGGATTAGATACCA
AAGCGTGGGGATCAAACAGGATTAGATACCG
AAGCGTGGGGATCAAACAGGATTAGATACCT

So according to the extension log, the k-mer AAAGCGTGGGGATCAAACAGGATTAGATACC has 3 children (A, G and T):

AAGCGTGGGGATCAAACAGGATTAGATACCA
AAGCGTGGGGATCAAACAGGATTAGATACCG
AAGCGTGGGGATCAAACAGGATTAGATACCT

K-mer information obtained with Ray -write-kmers:

[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep AAAGCGTGGGGATCAAACAGGATTAGATACC kmers.txt
AAAGCGTGGGGATCAAACAGGATTAGATACC;65738;A C G T;A C G T

According to the ASCII dump of the de Bruijn graph, the same k-mer has 4 children (A, C, G and T), so the C child is the one missing from the extension log.
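The comparison between the dump and the extension log can be reproduced with a short script. This is only a sketch: the field layout `kmer;coverage;parents;children` is inferred from the grep output shown in this thread (not taken from Ray's documentation), and the helper names `parse_kmer_line` and `child_kmers` are made up for illustration.

```python
def parse_kmer_line(line):
    """Split one kmers.txt line, assuming 'KMER;coverage;parents;children'."""
    kmer, coverage, parents, children = line.strip().split(";")
    return kmer, int(coverage), parents.split(), children.split()


def child_kmers(kmer, children):
    """A child keeps the last k-1 nucleotides and appends one symbol."""
    return [kmer[1:] + base for base in children]


# Line copied from the grep output in this thread.
dump_line = "AAAGCGTGGGGATCAAACAGGATTAGATACC;65738;A C G T;A C G T"

# Choices listed in the extension log.
log_choices = [
    "AAGCGTGGGGATCAAACAGGATTAGATACCA",
    "AAGCGTGGGGATCAAACAGGATTAGATACCG",
    "AAGCGTGGGGATCAAACAGGATTAGATACCT",
]

kmer, coverage, parents, children = parse_kmer_line(dump_line)

# Children present in the graph dump but absent from the extension's choices.
missing = sorted(set(child_kmers(kmer, children)) - set(log_choices))
print(missing)
```

With the values above, the only k-mer left in `missing` is the C child, which is also the k-mer the seed wanted to follow at position 177.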

Parents:

[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep AAAAGCGTGGGGATCAAACAGGATTAGATAC kmers.txt
AAAAGCGTGGGGATCAAACAGGATTAGATAC;147;A C G;C
[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep TAAAGCGTGGGGATCAAACAGGATTAGATAC kmers.txt
TAAAGCGTGGGGATCAAACAGGATTAGATAC;18;C G;C
[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep CAAAGCGTGGGGATCAAACAGGATTAGATAC kmers.txt
CAAAGCGTGGGGATCAAACAGGATTAGATAC;9;G;C
[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep GAAAGCGTGGGGATCAAACAGGATTAGATAC kmers.txt
GAAAGCGTGGGGATCAAACAGGATTAGATAC;65602;A C G T;A C G T

Children:

[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep AAGCGTGGGGATCAAACAGGATTAGATACCA kmers.txt
AAGCGTGGGGATCAAACAGGATTAGATACCA;11;A;G T
[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep AAGCGTGGGGATCAAACAGGATTAGATACCT kmers.txt
AAGCGTGGGGATCAAACAGGATTAGATACCT;648;A G;C G T
[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep AAGCGTGGGGATCAAACAGGATTAGATACCC kmers.txt
AAGCGTGGGGATCAAACAGGATTAGATACCC;65536;A C G T;A C G T
[sboisver12@colosse1 Sample_X-SilverRay.128.kmers.2]$ grep AAGCGTGGGGATCAAACAGGATTAGATACCG kmers.txt
AAGCGTGGGGATCAAACAGGATTAGATACCG;14;A G;C T

Only the C child has a coverage of exactly 65536. This must be a rare event.

sebhtml commented 12 years ago

Error: The seed contains a choice not supported by the graph.
Extension length: 177 vertices
position=177 AAGCGTGGGGATCAAACAGGATTAGATACCC with 3 choices
The previous kmer is AAAGCGTGGGGATCAAACAGGATTAGATACC
Seed length: 189 vertices
Choices:
AAGCGTGGGGATCAAACAGGATTAGATACCA
AAGCGTGGGGATCAAACAGGATTAGATACCG
AAGCGTGGGGATCAAACAGGATTAGATACCT
m_ed->m_enumeratechoices_outgoingedges.size() -> 3
m_compactEdges -> 1 1 1 1 1 1 1 1

This makes no sense: all 8 edge bits are set, yet only 3 outgoing edges are enumerated. Something, somewhere, is filtering out an outgoing edge.

sebhtml commented 12 years ago

Rank A sends RAY_MPI_TAG_REQUEST_VERTEX_COVERAGE with 1 kmer to Rank B.

Rank B responds with 0 for the coverage.

A coverage of 0 should not be possible at this point...
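One hypothesis that fits the numbers in this thread is that the coverage gets truncated to 16 bits somewhere between Rank B's graph and Rank A's extension code. This is a guess, not something the thread confirms, and the thread does not say what the fix commit changed. The sketch below only shows the arithmetic; `MASK_16_BITS` and `truncate_to_16_bits` are hypothetical names, not Ray identifiers.

```python
# Hypothetical: assume the coverage is squeezed into 16 bits somewhere.
MASK_16_BITS = 0xFFFF


def truncate_to_16_bits(coverage):
    """Keep only the low 16 bits, as an unsigned 16-bit field would."""
    return coverage & MASK_16_BITS


# Coverage of each child of AAAGCGTGGGGATCAAACAGGATTAGATACC,
# copied from the kmers.txt output in this thread.
coverages = {"A": 11, "C": 65536, "G": 14, "T": 648}

# Children that would still look present (coverage > 0) after truncation.
surviving = [
    base
    for base, cov in sorted(coverages.items())
    if truncate_to_16_bits(cov) > 0
]
print(surviving)
```

Under this assumption the C child would be reported with a coverage of 0 and dropped, leaving the 3 choices seen in the log. The neighbouring high-coverage k-mers (65738 and 65602) would wrap to non-zero values, which would explain why the problem only shows up at exactly 65536.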

sebhtml commented 12 years ago

Fixed in 40b9641303c7c71c91a3b636c820568b78e22549.