chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

upgrade assembly continuity and LAI score #132

Open DO-T opened 3 years ago

DO-T commented 3 years ago

Hi! I used hifiasm (0.14-r312) with HiFi reads only to assemble an allotetraploid plant (~50x coverage), using the options -t48 -l0. All output files were generated successfully. The contig N50 of the HiFi assembly reaches ~13 Mb and the BUSCO score reaches ~98%. However, the LTR Assembly Index (LAI) is below 10. My sample is highly homozygous, and its ancestors diverged about 5 million years ago. Previously, we assembled a version 1 genome from Illumina paired-end reads (150 bp) together with PacBio reads, scaffolded with Hi-C data; its LAI was ~15. Also, the final HiFi assembly is about 50 Mb larger than we expected from k-mer analysis. How can I improve my assembly? Are there any options or methods you would recommend?

oushujun commented 3 years ago

For polyploid genomes, you need to use the -mono parameter to evaluate subgenomes independently; otherwise, you will get overcorrected LAI values due to high LTR identity between subgenomes, resulting in a low LAI. See the discussion here: https://github.com/oushujun/LTR_retriever/issues/83
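As a rough illustration of the -mono suggestion, the sketch below writes one sequence-name list per subgenome and builds a separate LAI command line for each. It rests on several assumptions that are not from this thread: the contig-to-subgenome assignment is hypothetical and has to come from your own analysis, the list file is assumed to hold one sequence name per line, and the file names (asm.fa, asm.fa.pass.list, asm.fa.out) are placeholders for the genome and the LTR_retriever outputs. Check the flags against `LAI -h` for your installed version before running anything.

```python
from collections import defaultdict
from pathlib import Path


def write_mono_lists(contig_to_subgenome, out_dir="."):
    """Write one sequence-name list per subgenome; return {subgenome: path}.

    Assumption: the list passed to -mono is plain text, one name per line.
    """
    groups = defaultdict(list)
    for contig, subgenome in contig_to_subgenome.items():
        groups[subgenome].append(contig)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths = {}
    for subgenome, contigs in sorted(groups.items()):
        path = out / f"subgenome_{subgenome}.list"
        path.write_text("\n".join(sorted(contigs)) + "\n")
        paths[subgenome] = path
    return paths


def lai_command(genome, mono_list):
    """Build an LAI argument list restricted to the sequences in mono_list.

    Assumption: the intact-LTR list and RepeatMasker output sit next to the
    genome as <genome>.pass.list and <genome>.out.
    """
    return [
        "LAI",
        "-genome", genome,
        "-intact", f"{genome}.pass.list",
        "-all", f"{genome}.out",
        "-mono", str(mono_list),
    ]


if __name__ == "__main__":
    # Hypothetical assignment of contigs to subgenomes A and B.
    assignment = {
        "ptg000001l": "A",
        "ptg000002l": "B",
        "ptg000003l": "A",
    }
    for subgenome, path in write_mono_lists(assignment, "mono_lists").items():
        print(subgenome, " ".join(lai_command("asm.fa", path)))
```

Each printed command would then be run separately, giving one LAI estimate per subgenome instead of a single genome-wide value.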

For the Illumina assembly, my wild guess is that homeologous LTRs may not have been assembled and resolved between subgenomes, so the genome-wide LTR identity is lower than expected and the final LAI correction overcorrected it to an unrealistically high value.

Shujun

DO-T commented 3 years ago

We will retest it this way. Thanks for your help!