GATB / gatb-minia-pipeline

GATB Minia assembly pipeline

Problems with assembling the data beyond kmer 127 and other queries. #17

Closed: harish0201 closed this issue 5 years ago

harish0201 commented 5 years ago

Hi!

I'm using the gatb-minia pipeline to assemble contigs for a couple of datasets. The assembly runs fine up to k=121 (I'm using a k-mer step of 10).

When it moves on to k=131, I get an error: EXCEPTION: Failure because of unhandled kmer size 131
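To illustrate the failure, here is a minimal Python sketch that lists the k values a step-10 multi-k run would visit and flags those above the binary's limit. The starting k of 21 and the limit of 128 are assumptions made for the example; the thread only shows that k=121 works and k=131 fails. The function names are hypothetical and are not part of the pipeline.

```python
def k_schedule(k_min, k_max, step):
    """Return the k values a multi-k run would visit, smallest first."""
    return list(range(k_min, k_max + 1, step))


def unsupported_k(ks, max_supported_k):
    """Return the k values larger than the limit the binary was built with."""
    return [k for k in ks if k > max_supported_k]


# Assumed values: start at k=21, step 10, stop at k=141,
# and a binary that handles k up to 128 (hypothetical limit).
ks = k_schedule(21, 141, 10)
too_large = unsupported_k(ks, 128)
print("k values visited:", ks)
print("k values the binary would reject:", too_large)
```

Under these assumptions the run reaches 121 and then fails on 131, which matches the error above.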

Are there any compile-time options that I can use to assemble at 131 and 141?

Also, how does the pipeline choose the assembly to scaffold? Does it take the last kmer+contigs.fa as input, or do you use other metrics as well (N50, median length, etc.)?

rchikhi commented 5 years ago

Hi Harish,

Oh, my bad. In the latest gatb-pipeline update, the minia binary didn't support large k values. This will be fixed in an hour or so.

Regarding scaffolding, you guessed right. It uses the contig file from the last k value. That last one, due to the multi-k assembly principle, is always supposed to be the "best" assembly.
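As a rough sketch of that selection rule (not the pipeline's actual code), the Python snippet below picks the contig file with the largest k from a list of file names. The naming pattern `<prefix>_k<K>.contigs.fa` and the function name are assumptions made for the example.

```python
import re


def last_k_contigs(filenames):
    """Return the contig file whose name carries the largest k value."""
    # Hypothetical naming scheme: <prefix>_k<K>.contigs.fa
    pattern = re.compile(r"_k(\d+)\.contigs\.fa$")
    candidates = []
    for name in filenames:
        match = pattern.search(name)
        if match:
            # Compare k as an integer so that 121 ranks above 61.
            candidates.append((int(match.group(1)), name))
    if not candidates:
        raise ValueError("no contig files matching the assumed pattern")
    return max(candidates)[1]


files = ["asm_k21.contigs.fa", "asm_k121.contigs.fa", "asm_k61.contigs.fa"]
print(last_k_contigs(files))
```

Parsing k as a number matters here: a plain string sort would rank k61 after k121.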

harish0201 commented 5 years ago

Ah, thanks for the update @rchikhi!

I was worried that I had pulled the wrong release or forgotten some compile-time options, à la Velvet!

rchikhi commented 5 years ago

There are indeed some compile-time options in minia but gatb-pipeline provides you with a pre-compiled minia.

Apologies for the delay. I'm trying to push a commit that updates minia to support k up to 256. GitHub disallows large binaries, and somehow minia's binary got much bigger than 100 MB, so I have to work around that.

rchikhi commented 5 years ago

The correct minia binary (supporting k up to 256) has now been put in the repository.