chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

How to deal with extremely high mapping rate? #699

Open oddguyeee opened 3 months ago

oddguyeee commented 3 months ago

I used hifiasm with HiFi reads to build primary contigs, then scaffolded them with the 3D-DNA pipeline. The resulting genome is relatively complete (BUSCO 92.9%, 95% of primary contigs), but when I mapped the HiFi reads back to the final assembly I found an extremely high mapping rate. According to the log file, the homozygous and heterozygous read coverage thresholds were 36 and 18, respectively, as shown in the k-mer plot. How should I adjust the assembly parameters? [image]

kiratalreja3 commented 2 months ago

Those could be centromeric/satellite repeat regions. If so, the higher mapping rate is expected. Use tools like Flagger/Inspector to check the collapsed & duplicated regions of the assembly.
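As a rough first look before running a dedicated tool, the per-base depth from mapping the HiFi reads back to the assembly can be averaged in windows and compared with the expected homozygous coverage (36 in the log above). The sketch below assumes three tab-separated columns (contig, 1-based position, depth), which is the layout `samtools depth` writes. The window size and the 2x cutoff are arbitrary assumptions for illustration, not values taken from hifiasm, Flagger, or Inspector.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def window_mean_depth(lines: Iterable[str], window: int = 10_000) -> Dict[Tuple[str, int], float]:
    """Average per-base depth in fixed windows.

    Each input line is expected to hold: contig <TAB> 1-based position <TAB> depth.
    Keys of the result are (contig, 0-based window start).
    """
    sums: Dict[Tuple[str, int], int] = defaultdict(int)
    counts: Dict[Tuple[str, int], int] = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        contig, pos, depth = line.split("\t")[:3]
        start = ((int(pos) - 1) // window) * window
        sums[(contig, start)] += int(depth)
        counts[(contig, start)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def flag_high_depth(
    means: Dict[Tuple[str, int], float],
    expected: float = 36.0,  # homozygous coverage reported in the hifiasm log
    factor: float = 2.0,     # assumed cutoff: flag windows above factor * expected
) -> List[Tuple[str, int, float]]:
    """Return (contig, window start, mean depth) for windows above the cutoff."""
    return [
        (contig, start, mean)
        for (contig, start), mean in sorted(means.items())
        if mean > factor * expected
    ]
```

To try it, pass an open file handle of depth output to `window_mean_depth` and hand the result to `flag_high_depth`. Windows flagged this way are only candidates for collapsed repeats; they still need checking with a tool built for that purpose.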

oddguyeee commented 2 months ago

Thank you @kiratalreja3,

In addition, I have another question: the draft assembly from hifiasm with default parameters shows a relatively low BUSCO score of 95%. How can I improve this?

kiratalreja3 commented 2 months ago

If you have PacBio subread data, run DeepConsensus for error correction before assembling with hifiasm. Also make sure you run adapter trimming. After the assembly is done, you can use Inspector or similar tools to perform error correction. Then, after 3D-DNA, consider using a gap-filling tool like TGS-GapCloser to fill the scaffolding gaps. These are some strategies that I use.
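For the adapter step, a quick sanity check is to count how many reads still contain the adapter sequence. The sketch below is a simplified illustration, not how any particular trimming tool works: it looks only for exact matches of a caller-supplied adapter (or its reverse complement) in a 4-line-per-record FASTQ, whereas real tools allow mismatches. The function names are hypothetical, and the adapter sequence has to come from your library prep documentation.

```python
from typing import Iterable, List

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def reads_with_adapter(fastq_lines: Iterable[str], adapter: str) -> List[str]:
    """Return names of reads containing an exact copy of the adapter.

    Assumes plain 4-line FASTQ records (header, sequence, '+', qualities)
    and checks both strands of the adapter.
    """
    forward = adapter.upper()
    reverse = reverse_complement(forward)
    lines = [line.rstrip("\n") for line in fastq_lines]
    hits: List[str] = []
    for i in range(0, len(lines) - 3, 4):
        header, seq = lines[i], lines[i + 1].upper()
        if forward in seq or reverse in seq:
            # Read name is the header without '@', up to the first whitespace.
            fields = header[1:].split()
            hits.append(fields[0] if fields else "")
    return hits
```

If this returns more than a handful of reads, it is probably worth rerunning adapter filtering before assembly.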