voutcn / megahit

Ultra-fast and memory-efficient (meta-)genome assembler
http://www.ncbi.nlm.nih.gov/pubmed/25609793
GNU General Public License v3.0

Question: how does megahit deal with read sets of different lengths? #336

Open Valentin-Bio-zz opened 2 years ago

Valentin-Bio-zz commented 2 years ago

Hello, I would like to understand what happens at each k-mer step of the iterative assembly. Let's say I have a read set with mixed lengths: the shortest reads are 40 bp and the longest are 150 bp (the great majority fall in the 130-150 bp range). If I let megahit choose the k-mer sizes, how does it deal with the shorter reads? Looking at the log file of a running assembly, I see "k-max reset to: 141", which I take to mean that the last iteration will use a k-mer size of 141. So I assume that reads shorter than the current k-mer size are discarded, and that the iteration uses the previously assembled contigs plus the input reads that are at least as long as that k-mer size. Is that correct?
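To make the arithmetic behind the question concrete, here is a minimal sketch of how many k-mers a read of a given length can contribute at each step. The only fact it relies on is that a read of length L contains L - k + 1 k-mers when L >= k and none otherwise. The k-mer schedule in `K_LIST` is an assumption for illustration (the log only confirms a k-max of 141), and the helper names are hypothetical, not part of megahit.

```python
def kmers_in_read(read_length: int, k: int) -> int:
    """Number of k-mers a single read of this length can contribute."""
    return max(0, read_length - k + 1)


# Assumed k-mer schedule, for illustration only. The real list depends on
# the megahit version and options; the log only confirms k-max = 141.
K_LIST = [21, 29, 39, 59, 79, 99, 119, 141]

# Read lengths mentioned in the question.
READ_LENGTHS = [40, 130, 150]


def kmer_table(read_lengths, k_list):
    """Map each read length to its k-mer count at every k in the schedule."""
    return {
        length: {k: kmers_in_read(length, k) for k in k_list}
        for length in read_lengths
    }


if __name__ == "__main__":
    for length, row in kmer_table(READ_LENGTHS, K_LIST).items():
        counts = ", ".join(f"k={k}: {n}" for k, n in row.items())
        print(f"{length} bp read -> {counts}")
```

Under these assumptions, a 40 bp read can only contribute k-mers at k = 21, 29 and 39, and even a 130 bp read contributes none at k = 141. This only shows which reads could supply k-mers at each step; whether megahit actually drops the shorter reads at larger k or treats them some other way is what I am asking.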