aaranyue / quarTeT

A telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification
http://atcgn.com:8080/quarTeT/home.html

Assessing the genome after filling the gaps showed that qv values are reduced #27

Closed doubleHwithT closed 7 months ago

doubleHwithT commented 10 months ago

Dear Author, I hope this message finds you well! Firstly, thank you so much for developing this wonderful tool. It helped me a lot with gapless genome assembly.

I ran into a problem in a recent gap-filling run. I assessed the genome before filling the gaps and the QV was 65; after filling the gaps, the QV dropped to 57. The filled sequence is very small, about 3,600 kb (genome size is around 316 Mb). The genome backbone was assembled from HiFi data alone and adjusted with juicer (the QV did not change between the assembled genome and the genome after juicer). The sequences used to fill the gaps come from a genome (or contigs) assembled from HiFi + ultra-long ONT data.

I would like to ask what causes the QV drop and, if it is caused by the gap filling, whether there is any way to avoid it.

Thank you so much!

Best wishes~

Bruce

Echoring commented 10 months ago

I'm not entirely sure about this. If you use Merqury to assess the genome, then by definition QV = -10*log10(1-(1-Kasm/Ktotal)^(1/k)), where Kasm is the number of k-mers found only in the assembly (not in the reads) and Ktotal is the total number of k-mers in the assembly. If QV drops, it likely means that the number of k-mers found only in the assembly has risen a lot. This should only happen at the filled positions, and may be due to improper filling. You can try stricter parameters (increase -l or decrease -f), or check the filled gaps manually.
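
To get a feel for how many assembly-only k-mers a drop from QV 65 to QV 57 corresponds to, the formula above can be turned into a small Python sketch. This is only an illustration, not part of quarTeT or Merqury: the function names are hypothetical, k = 21 is an assumed k-mer size (use whatever k your Merqury run used), and Ktotal is approximated by the 316 Mb genome size rather than taken from a real Merqury output.

```python
import math


def merqury_qv(k_asm_only, k_total, k=21):
    """QV = -10*log10(1 - (1 - Kasm/Ktotal)^(1/k)).

    k_asm_only: k-mers found only in the assembly (Kasm)
    k_total:    all k-mers in the assembly (Ktotal)
    k:          k-mer size (21 is an assumption, not a value from the thread)
    """
    if k_asm_only == 0:
        return math.inf  # no erroneous k-mers, the error rate is zero
    p_correct = (1.0 - k_asm_only / k_total) ** (1.0 / k)
    return -10.0 * math.log10(1.0 - p_correct)


def asm_only_kmers_for_qv(qv, k_total, k=21):
    """Invert the formula: how many assembly-only k-mers give this QV?"""
    error_rate = 10.0 ** (-qv / 10.0)
    return k_total * (1.0 - (1.0 - error_rate) ** k)


if __name__ == "__main__":
    K_TOTAL = 316e6  # assumption: roughly one k-mer per base of a 316 Mb genome
    for qv in (65.0, 57.0):
        n = asm_only_kmers_for_qv(qv, K_TOTAL)
        print(f"QV {qv:.0f} -> about {n:,.0f} assembly-only k-mers")
```

Under these assumptions, comparing the two printed counts gives a rough idea of how many extra assembly-only k-mers the filled regions would have to contribute. Because the relationship is logarithmic, a small number of wrong bases in the filled gaps can be enough to move the QV noticeably.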