sebhtml / ray

Ray -- Parallel genome assemblies for parallel DNA sequencing
http://denovoassembler.sf.net

Ray v2.1.0 gives bad k-mer coverage distributions #86

Closed sebhtml closed 12 years ago

sebhtml commented 12 years ago

/home/seb/2012/bug-ray-2.1-dev

sebhtml commented 12 years ago

the k-mer coverage distribution is wrong

1.7.1 was fine. 1.7.1 was a devel tag, not a stable release. => Ray-human-genome-2011-12-16-k21 => ./Ray-human-genome-2011-12-15-k21/logs/Ray-human-genome-2011-12-15-k21.1/CoverageDistribution.txt

2.1.0-devel is bad => no it is not.

sebhtml commented 12 years ago

now running 2.0.0 to see if it is tainted too

human-genome-2012-10-02-512-k21-v2.0.0.1.1.492

sebhtml commented 12 years ago

2.0.0 => /mnt/scratch_mp2/corbeil/corbeil_group/projects/African-Genome

sebhtml commented 12 years ago

it is not the Bloom filter

/mnt/scratch_mp2/corbeil/corbeil_group/projects/African-Genome/human-genome-2012-09-30-512-no-bloom.1

even without the Bloom filter, the distro is wrong

thus, it is maybe the patch work on the ADD_VERTEX system calls.

sebhtml commented 12 years ago

it may be the incremental resizing code of the table.

sebhtml commented 12 years ago

not the case either,

Rank 1 has 49578604 k-mers (completed)

   -hash-table-buckets buckets
          Sets the initial number of buckets. Must be a power of 2 !
          Default value: 268435456

no incremental resizing occurred...
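A quick sanity check of the numbers above, as a minimal sketch: it verifies that the default bucket count is a power of 2 and computes how many k-mers Rank 1 holds per bucket. The function name is hypothetical, and the threshold at which Ray's table actually triggers incremental resizing is not stated in this thread, so the ratio is only indicative.

```python
def is_power_of_two(n):
    """True when n is a positive power of 2 (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0

# Values quoted in the comment above.
buckets = 268435456          # default for -hash-table-buckets
kmers_on_rank = 49578604     # "Rank 1 has 49578604 k-mers"

print(is_power_of_two(buckets))          # True (2**28)
print(round(kmers_on_rank / buckets, 3)) # k-mers per bucket, about 0.185
```

With well under one k-mer per bucket on this rank, it is plausible that no resize was triggered, which matches the observation above.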

sebhtml commented 12 years ago

I only observe the behavior on SRA000271 on Mammouth.

sebhtml commented 12 years ago

maybe it is the new routing ?

sebhtml commented 12 years ago

this would make sense as it seems that k-mers are not concentrated enough

I added the new routing code on Saturday, still testing it actually.

=> human-genome-2012-10-02-512-no-router-k21-v2.1.0.1.1.505

sebhtml commented 12 years ago

it is not the routing either because this run is bad, but was done before changing the routing code:

/mnt/scratch_mp2/corbeil/corbeil_group/projects/African-Genome/human-genome-2012-09-26-2048.1

Date: Wed Sep 26 11:32:37 2012

the router change was done with:

commit 50c78ae37f9c7aa0d42c4fe50ee28411af79dad0
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Sat Sep 29 22:40:32 2012 -0400

MessageRouter: store the routing information in the buffer

the previous RayPlatform change was on 2012-09-05.

sebhtml commented 12 years ago

Maybe it is the polytope that is faulty ?

mpiexec -n 2048 Ray \
 -o human-genome-2012-09-26-2048.1 \
 -k 21 \
 -route-messages \
 -connection-type polytope \
 -routing-graph-degree 11

sebhtml commented 12 years ago

So I have to wait for:

v2.0.0-backported-some-patches => human-genome-2012-10-02-512-k21-v2.0.0.1.1.000

v2.1.0-devel without the router =>

mpiexec -n 512 Ray \
 -o human-genome-2012-10-02-512-no-router-k21-v2.1.0.1 \
 -k 21 \
 -p Sample/SRR002271_1.fastq.bz2 Sample/SRR002271_2.fastq.bz2 \

2 4048038546
3 1709489922
4 1010986308
5 755092590
6 655842266
7 604972844
8 560367510
9 507524496
10 444596002
11 375876726
12 307143988
13 243536196
14 188462006
15 143344938
16 108006142
17 81334968
18 61817430
19 47792628

=> the problem is not with the message router.
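One way to screen a CoverageDistribution.txt automatically, as a minimal sketch: parse the (coverage, number of k-mers) pairs and look for a local maximum after the initial low-coverage decline. The assumption here, not stated in this thread, is that a healthy distribution shows such a peak while a bad one keeps decreasing. The function names are hypothetical and the sample values are the first rows quoted above.

```python
def parse_distribution(text):
    """Parse 'coverage count' pairs, one pair per line or flattened on one line."""
    tokens = []
    for line in text.splitlines():
        if line.startswith("#"):  # skip a header line, if the file has one
            continue
        tokens.extend(line.split())
    return [(int(tokens[i]), int(tokens[i + 1]))
            for i in range(0, len(tokens) - 1, 2)]

def find_peak(pairs):
    """Return the coverage of the first local maximum after the initial
    decline, or None when the counts never rise again."""
    counts = [count for _, count in pairs]
    i = 1
    # Walk down the initial decline (k-mers seen only a few times).
    while i < len(counts) and counts[i] <= counts[i - 1]:
        i += 1
    if i >= len(counts):
        return None  # monotonically decreasing: no peak found
    # Climb until the counts start to fall again.
    while i + 1 < len(counts) and counts[i + 1] >= counts[i]:
        i += 1
    return pairs[i][0]

observed = parse_distribution(
    "2 4048038546 3 1709489922 4 1010986308 5 755092590 6 655842266")
print(find_peak(observed))  # None: these rows only decrease
```

A distribution with a trough followed by a rise, such as `[(1, 100), (2, 50), (3, 20), (4, 30), (5, 60), (6, 40)]`, would report coverage 5 instead.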

sebhtml commented 12 years ago

Ray v2.0.0 is 6adeef3d814dc2acbc32444ec3ed5a49a709e98c and uses RayPlatform 09517b6862d04743f64abc181de21b7d8c8b5dbd

seb@fault ~/git-clones/ray $ git log v2.0.0..HEAD|grep commit|wc -l
130

seb@fault ~/git-clones/RayPlatform $ git log v1.0.3..HEAD|grep ^commit|wc -l
52

I will need 7 bisection steps.

The regression is not happening on bacterial genomes. This is strange.

For the bisection, Ray will lead and the corresponding RayPlatform (the latest at the corresponding date) will be selected.
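The step count follows from binary search over the commit range. A minimal sketch of the arithmetic, assuming each bisection step halves the set of candidate commits; the function name is hypothetical, and the estimate printed by git bisect itself may differ by one.

```python
import math

def bisection_steps(candidate_commits):
    """Return (lower, upper) whole-number bounds on log2 of the candidate count."""
    exact = math.log2(candidate_commits)
    return math.floor(exact), math.ceil(exact)

print(bisection_steps(130))  # (7, 8): Ray commits since v2.0.0
print(bisection_steps(52))   # (5, 6): RayPlatform commits since v1.0.3
```

Since log2(130) is just above 7, "7 bisection steps" is a fair estimate, with an eighth possible in the worst case.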

sebhtml commented 12 years ago

First, I have to check that v2.0.0 works on SRA000271 too !

sebhtml commented 12 years ago

There are 9 commits for the KmerAcademyBuilder that builds the coverage distribution.

seb@fault ~/git-clones/ray $ git log v2.0.0..HEAD code/plugin_KmerAcademyBuilder/KmerAcademyBuilder.cpp |grep ^commit|wc -l
9

sebhtml commented 12 years ago

commit 06d281d99943da448091142dd0378e90d789b53c => not likely
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Mon Sep 3 01:02:45 2012 -0400

VerticesExtractor: this module extracts vertices to add edges

The plugin VerticesExtractor adds the edges while extracting
vertices. Because it no longer needs to populate the vertices,
the text displayed to the end user was updated.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 14 +++++++-------
1 files changed, 7 insertions(+), 7 deletions(-)

commit 4209dc6f14547e3884cad533f2e318e4c0adc545 => maybe
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Mon Sep 3 00:41:31 2012 -0400

KmerAcademyBuilder: removed the k-mer academy

The k-mer academy is mostly useless because there is the Bloom
filter that does the same thing, but with less memory. The k-mer
academy was created before the Bloom filter. When the Bloom
filter was added, the k-mer academy remained in the code. This
patch corrects this.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 17 ++++++++---------
1 files changed, 8 insertions(+), 9 deletions(-)

commit 4e763bb3c18cb565ab414350bf8206ecf6b77490 => not likely
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Tue Aug 28 17:33:41 2012 -0400

MessageProcessor: don't discard k-mers while receiving messages

This patch is a fix for the regression introduced by the
recent patch dc65e18327870ad5f57bb75d7b4760e0141e6493.
The plugin KmerAcademyBuilder was modified to be clearer and to fix a bug.
The change was to send any forward k-mer, not just the lower of a pair.
However, the current plugin sends messages to another one, which was still
implementing the old behavior. The handlers
call_RAY_MPI_TAG_VERTICES_DATA and call_RAY_MPI_TAG_KMER_ACADEMY_DATA
were modified accordingly.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 30 ++++++++++++++++---
1 files changed, 25 insertions(+), 5 deletions(-)

commit dc65e18327870ad5f57bb75d7b4760e0141e6493 => the possible culprit !
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Tue Aug 28 15:00:39 2012 -0400

KmerAcademyBuilder: only send the forward k-mer, not the lower

This patch has no impact on the result generated by the assembly.
Basically, each k-mer has a reverse-complement and sending both
will increase by one the coverage of both. But in the
implementation, a forward k-mer and its reverse-complement are
stored together to save memory. So, for any forward k-mer occurring
in a read (which can come from the forward strand or from the
reverse-complement strand) sending only the forward k-mer (as is)
is totally equivalent to sending the lowest k-mer between the
forward and the reverse-complement. But the code is clearer if it
sends simply just the forward instead of the lowest.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 31 +++++++++++--------
1 files changed, 18 insertions(+), 13 deletions(-)

commit d46428ddad00997a431f89c89eb3080c8c0ec8e2 => not likely
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Wed Aug 8 15:04:48 2012 -0400

Porting Ray to the new RayPlatform: Ray compiles with the simplified
RayPlatform adapters now.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

commit c13e1bd14cb2d67d91147fd49211221d1a4937b1 => no
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Wed Aug 8 13:59:21 2012 -0400

Porting Ray to the new RayPlatform: remove calls to setObject.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

commit 43137b4861784fc0a427ebf35c2bcf2e073dc09a => no
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Wed Aug 8 13:52:16 2012 -0400

Porting Ray to the new RayPlatform: updated the macro names in C++
plugin files.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

commit 31b078873c4135d81fb81cc37961a5d685614b26 => no
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Wed Aug 8 13:48:36 2012 -0400

Porting Ray to the new RayPlatform: added CreatePlugin and BindPlugin
instructions.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

commit 746681dc025fc572a24486433995ad1913cc3d4e => no
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Wed Aug 8 13:38:04 2012 -0400

Porting Ray to the new RayPlatform: removed token 'generated_automatically'.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>

.../KmerAcademyBuilder.cpp | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)

sebhtml commented 12 years ago

Maybe this one:

NOP

commit dc65e18327870ad5f57bb75d7b4760e0141e6493 => the possible culprit !
Author: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
Date: Tue Aug 28 15:00:39 2012 -0400

KmerAcademyBuilder: only send the forward k-mer, not the lower

This patch has no impact on the result generated by the assembly.
Basically, each k-mer has a reverse-complement and sending both
will increase by one the coverage of both. But in the
implementation, a forward k-mer and its reverse-complement are
stored together to save memory. So, for any forward k-mer occurring
in a read (which can come from the forward strand or from the
reverse-complement strand) sending only the forward k-mer (as is)
is totally equivalent to sending the lowest k-mer between the
forward and the reverse-complement. But the code is clearer if it
sends simply just the forward instead of the lowest.

Signed-off-by: Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca>
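
The equivalence claimed in this commit message can be illustrated with a toy model. This is a minimal sketch, not Ray's code: all function names are hypothetical, k-mers are plain strings, and the "discarding receiver" is only my reading of the follow-up commit 4e763bb ("don't discard k-mers while receiving messages"). If the receiver stores a k-mer and its reverse-complement under one key, sending the forward k-mer or the lower of the pair gives the same counts; a receiver that still drops non-lower k-mers loses coverage.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(kmer):
    """Complement each base, then reverse the string."""
    return kmer.translate(COMPLEMENT)[::-1]

def lower_of_pair(kmer):
    """Lexicographically lower of a k-mer and its reverse-complement."""
    return min(kmer, reverse_complement(kmer))

def kmers(read, k):
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def send_lower(read, k):
    """Old sender: picks the lower k-mer before sending."""
    return Counter(lower_of_pair(kmer) for kmer in kmers(read, k))

def send_forward_store_pair(read, k):
    """New sender: sends the forward k-mer as is; the receiver stores the
    pair under a single key (modelled here by the lower k-mer)."""
    counts = Counter()
    for forward in kmers(read, k):
        counts[lower_of_pair(forward)] += 1
    return counts

def send_forward_discarding_receiver(read, k):
    """New sender with a receiver that still drops k-mers that are not
    the lower of their pair (assumed shape of the regression)."""
    counts = Counter()
    for forward in kmers(read, k):
        if forward == lower_of_pair(forward):
            counts[forward] += 1
    return counts

read = "AAATTT"
print(send_lower(read, 3) == send_forward_store_pair(read, 3))  # True
print(sum(send_forward_discarding_receiver(read, 3).values()))  # 2 of 4 k-mers kept
```

In this model, the sender and the receiver must agree on where the lower k-mer is chosen; the coverage is only preserved when exactly one side does it.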
sebhtml commented 12 years ago

Let's wait for the v2.0.0 result before bisecting.

sebhtml commented 12 years ago

the bz2 loader is fine.

Ray-african-edge-2012-10-02.1.177

1 75694
2 4042657506
3 1707421052
4 1010462022
5 755404938
6 656416132
7 605461336
8 560534488
9 507299578
10 444009922
11 375022844
12 306145716
13 242528388
14 187526244
15 142533306
16 107342374
17 80812806
18 61427958
19 47496974
20 37609450
21 30579314
22 25521264
23 21797182
24 18992920
25 16816590

sebhtml commented 12 years ago

The regression is also in v2.0.0

human-genome-2012-10-02-512-k21-v2.0.0.1/CoverageDistribution.txt

[boisver1@ip03 African-Genome]$ less human-genome-2012-10-02-512-k21-v2.0.0.1/CoverageDistribution.txt
2 3399222890
3 1635667606
4 998467178
5 762442382
6 668953114
7 616686942
8 566705866
9 506945286
10 437444404
11 363794140
12 292251026
13 227965064
14 173804488
15 130517866
16 97457646
17 73089464
18 55559468
19 43147816
20 34427664
21 28241818
22 23780806
23 20468206
24 17932850
25 15935998
26 14314066
27 12988088
28 11840758
29 10870954
30 10012768
31 9271036
32 8605694
33 8019974
34 7492032

sebhtml commented 12 years ago

What a waste of time.

It works just fine on colosse

All these guys pass:

./Ray-human-genome-2011-12-16-k21/Ray-human-genome-2011-12-16-k21.1/CoverageDistribution.txt
./Ray-human-genome-2011-12-15-k21/logs/Ray-human-genome-2011-12-15-k21.1/CoverageDistribution.txt
./Ray-Human-512-2012-09-21/CoverageDistribution.txt
./Ray-Human-512-debruijn-2012-09-21/CoverageDistribution.txt
./Ray-Human-512-polytope-2012-09-21/CoverageDistribution.txt

sebhtml commented 12 years ago

according to the documentation, I should use

module add openmpi_gcc64

https://rqchp.ca/?pageId=1228&lang=EN&

sebhtml commented 12 years ago

solved

no problem with:

module load openmpi_gcc64

causes problems:

module load ofed/1.4.1
module load gcc/4.7.0
module load openmpi_gcc64/1.5.4

sebhtml commented 12 years ago

module load openmpi_gcc64/1.6.4

causes the same strange behavior on Mammouth Parallèle II.

marking the mp2 module for OMPI 1.6.2 as tainted.

sebhtml commented 12 years ago

faulty with OMPI 1.6.2 on mp2 (probably a buggy build)

MAXKMERLENGTH: 32
KMER_U64_ARRAY_SIZE: 1
Maximum coverage depth stored by CoverageDepth: 4294967295
MAXIMUM_MESSAGE_SIZE_IN_BYTES: 4000 bytes
FORCE_PACKING = n
ASSERT = n
HAVE_LIBZ = y
HAVE_LIBBZ2 = y
CONFIG_PROFILER_COLLECT = n
CONFIG_CLOCK_GETTIME = n
linux = y
_MSC_VER = n
GNUC = y
RAY_32_BITS = n
RAY_64_BITS = y
MPI standard version: MPI 2.1
MPI library: Open-MPI 1.6.2
Compiler: GNU gcc/g++ 4.7.0
With hardware pop count

sebhtml commented 12 years ago

it is either Open-MPI 1.6.2 or gcc 4.7.0 or something odd on mp2