ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0
513 stars 94 forks source link

What does ‘share k-mers’ mean? #194

Closed phoenix-jiaqi closed 4 years ago

phoenix-jiaqi commented 4 years ago

Hi. In paper,'Bins/boxes with the same color suggest they share k-mers'. What does ‘share k-mers’ mean?

phoenix-jiaqi commented 4 years ago

Is that different bins have a same k-mer at least ?

ruanjue commented 4 years ago

When there is no sequencing error and repetitve sequences, bins from the same genomic region are identical, sharing all kmers. If sequencing high error rate, only a few k-mers are identical between 'identical' bins. If has repetitive sequences, different bins might share k-mers. However, wtdbg2 find matching bins in a dynamic programming way, which tolerates errors and repetitive seqs. The figure iluminates a basic framework of wtdbg2.