BinPro / CONCOCT

Clustering cONtigs with COverage and ComposiTion

Question about parameter used in complete example #205

Closed by franciscozorrilla 5 years ago

franciscozorrilla commented 5 years ago

I'm not sure if this is the best place to ask this question, but here goes.

On the readthedocs complete example page you use a hash length of 71 when running the velveth step. I was wondering how you decided on this value, and whether you would recommend tweaking it depending on the data set being examined?

Thanks, Francisco

alneberg commented 5 years ago

Hi @franciscozorrilla,

This specific question is probably best answered by @inodb, but I guess it's just to allow slightly longer k-mers than standard velvet does.

However, we do not recommend using velvet in a real setting. We recommend megahit, which is a much more modern metagenomics assembler. Sorry for not keeping the tutorial up to date.

inodb commented 5 years ago

@franciscozorrilla This made sense at the time: a relatively large k-mer to keep the contigs pure. To answer your question: yes, back then you would play around with this depending on your read length and the coverage of your population. But like Johannes said, you'll want to try megahit these days.
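
As a rough illustration of why the hash length interacts with read length and coverage: the Velvet manual relates k-mer coverage to nucleotide coverage as Ck = C * (L - k + 1) / L, where L is the read length and k the hash length. The sketch below just evaluates that formula. The function name and the example numbers (100 bp reads, 30x coverage) are assumptions for illustration and do not come from the tutorial or this thread.

```python
def kmer_coverage(base_coverage, read_length, k):
    """K-mer coverage per the Velvet manual: Ck = C * (L - k + 1) / L."""
    if k < 1 or k > read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_length - k + 1) / read_length


if __name__ == "__main__":
    # Hypothetical data set: 100 bp reads at 30x nucleotide coverage.
    read_length = 100
    base_coverage = 30
    # Velvet expects odd hash lengths, so only odd values are tried here.
    for k in (31, 51, 71, 91):
        ck = kmer_coverage(base_coverage, read_length, k)
        print(f"k={k}: k-mer coverage ~ {ck:.1f}x")
```

With these assumed numbers, a larger k leaves fewer k-mers per read, so the effective k-mer coverage drops as k approaches the read length. That is the trade-off against contig purity described above.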

franciscozorrilla commented 5 years ago

Hey @alneberg @inodb

Thanks for your response! Yes, my PI actually suggested the same thing a while ago, but I must have forgotten to implement the change in my pipeline. Cheers!