ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0

Separation of contigs by long deletions #241

Closed: ziczhang closed this issue 2 years ago

ziczhang commented 2 years ago

Sorry if this is a rudimentary question. I'm using wtdbg2 to assemble a human genome from Nanopore long reads (40 Gbase, about 13x coverage). I ran wtdbg2 as "wtdbg2 -t $THREADS -x ont -g 3G -i $FASTQ -fo $OUTPUT" and then polished the assembly. When I mapped both the contigs and the long reads to a human reference genome, I found that some of the contigs were separated by long deletions, as shown in this figure: [image]

Can you help with this?

Thanks.

ruanjue commented 2 years ago

If there is a heterozygous INDEL, wtdbg2 might fail to merge the two alleles and instead break the region into more than one contig. That said, I don't fully understand your exact question.

ziczhang commented 2 years ago

Thanks Jue. I'm just curious why this happens even though the reads are still connected across these regions, and, if something is wrong with my parameters, how I can fix it. But as you said, I should consider heterozygous INDELs.

ruanjue commented 2 years ago

Yes, please check for INDELs around the breakpoints.
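
For anyone wanting to follow up on this suggestion, here is a minimal sketch of one way to look for long INDELs in the read alignments. It is not part of wtdbg2 and not the author's method. It assumes the long reads have already been aligned to the reference and written as plain-text SAM. The function names `long_indels` and `scan_sam` and the 50 bp threshold are made up for illustration. It walks each CIGAR string and reports insertions or deletions at least `min_len` bases long, together with the reference position where they start.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def long_indels(pos, cigar, min_len=50):
    """Return (ref_start, op, length) for each I or D of at least min_len bases.

    pos is the 1-based leftmost reference position (the SAM POS field).
    min_len=50 is an arbitrary illustrative threshold, not a wtdbg2 setting.
    """
    ref = pos
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "ID" and length >= min_len:
            found.append((ref, op, length))
        # Only these operations consume reference bases.
        if op in "MDN=X":
            ref += length
    return found

def scan_sam(lines, min_len=50):
    """Yield (read_name, ref_name, ref_start, op, length) from SAM text lines."""
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":
            continue  # unmapped or malformed record
        for ref_start, op, length in long_indels(int(fields[3]), fields[5], min_len):
            yield fields[0], fields[2], ref_start, op, length
```

Passing an open SAM file to `scan_sam` lists the candidate events per read. If roughly half of the reads spanning a contig break carry the same long deletion and the rest do not, that would be consistent with the heterozygous INDEL explanation above, though it does not prove it.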