sebhtml / ray

Ray -- Parallel genome assemblies for parallel DNA sequencing
http://denovoassembler.sf.net

Assembler panic: no peak observed in the k-mer coverage distribution. #42

Closed mscook closed 12 years ago

mscook commented 12 years ago

Hi Seb,

I've run into this panic.

Full log pasted here: http://pastebin.com/t3XWDiXu

Ray version 1.7, installed via:

module load OpenMPI/1.5.3
module load compiler/gcc-4.5.2
make MAXKMERLENGTH=101
cp -r * ../../Ray/1.7/

Cheers

Mitch

sebhtml commented 12 years ago

Hi!

Ray used to require a clear peak in the k-mer coverage distribution to work out which k-mers are unique in the genome, which are repeated, and which are erroneous (k-mers that don't spell a genomic word at all).
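To make "a peak in the coverage distribution" concrete, here is a minimal Python sketch. It is not Ray's code: the function names and the peak heuristic (walk down the initial slope of low-coverage k-mers, then take the highest point after the valley) are assumptions chosen for illustration only.

```python
from collections import Counter


def kmer_coverage_distribution(reads, k):
    """Return {coverage: number of distinct k-mers observed that many times}."""
    kmer_counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer_counts[read[i:i + k]] += 1
    return Counter(kmer_counts.values())


def find_peak(distribution):
    """Return the coverage at the peak, or None if no peak is observed.

    Hypothetical heuristic (not Ray's): skip the initial descending run,
    which is dominated by low-coverage k-mers, then pick the most frequent
    coverage value after the valley.
    """
    if not distribution:
        return None
    max_cov = max(distribution)
    # freq[i] is the number of distinct k-mers with coverage i + 1.
    freq = [distribution.get(c, 0) for c in range(1, max_cov + 1)]
    valley = 0
    while valley + 1 < len(freq) and freq[valley + 1] <= freq[valley]:
        valley += 1
    if valley + 1 >= len(freq):
        return None  # the curve never rises again: no peak
    peak_index = max(range(valley + 1, len(freq)), key=lambda i: freq[i])
    return peak_index + 1


if __name__ == "__main__":
    with_peak = {1: 100, 2: 40, 3: 10, 4: 30, 5: 80, 6: 50}
    without_peak = {1: 100, 2: 50, 3: 10}
    print(find_peak(with_peak))     # 5
    print(find_peak(without_peak))  # None
```

A distribution that only ever decreases, as in the second example, has no such peak, which is the kind of input that would trigger the panic in the title under the old approach.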

We have heavily modified Ray so that it now works without any peak in the coverage distribution. That means Ray can now handle metagenome and transcriptome assemblies too!

We are preparing a paper about that.

Can you try with Ray v2.0.0-beta6? Just clone the master branch.

The command:

git clone git://github.com/sebhtml/ray.git

In the last couple of weeks, we have also been modifying the parallel architecture of Ray to make it as modular as Linux, using plugins. This is almost done.

The Ray parallel genome assembler is built on the Ray Platform, a parallel programming framework that uses MPI.

You can create other parallel programs easily with the Ray Platform too. A stable API for the Ray Platform will soon be available.

By the way, if you want to do parallel programming with the Ray Platform, check the example here: https://github.com/sebhtml/RayPlatform-example

-seb