chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

After adjusting the parameters based on the information from the Hifiasm.log file, the issue of "phased assembly is much larger than the estimated genome size" still persists. #582

Open xujialupaoli opened 9 months ago

xujialupaoli commented 9 months ago

Hello, while assembling the genome of a species from HiFi data with hifiasm, I ran into a peculiar situation. The genome size of my species is approximately 1.1 Gb. Following your software recommendations and the k-mer analysis results in the log file, we set the parameters to -s 0.35 --hom-cov 24. However, the output assembly contains two haplotypes with sizes of 5.3 Gb and 5.4 Gb, respectively. Could you help me understand why this is happening and how to address it effectively?

My command:
hifiasm -o /home/work/enlian/hifiasm_result_5_onlyhifi/5_hifiasm_ONHF -t 64 -s 0.35 --hom-cov 24 /home/work/enlian/m84114_231110_103510_s2.hifi_reads.bc2059.fq /home/work/enlian/m84114_231113_054910_s2.hifi_reads.bc2059.fq /home/work/enlian/m84114_231121_083911_s1.hifi_reads.fastq

[Two images attached]

hifiasm_ONHF.txt

chhylp123 commented 9 months ago

Do you know the homozygous coverage of your dataset? Looking at the k-mer plot, the main issue is that the dataset is not very clean around 2 or 3. A good k-mer plot should look like the one at https://hifiasm.readthedocs.io/en/latest/faq.html#why-does-hifiasm-stuck-or-crash.
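One rough way to sanity-check the homozygous coverage, independent of the k-mer plot, is to divide the total number of HiFi bases by the estimated haploid genome size and compare the result with the value passed to --hom-cov. The sketch below is not part of hifiasm. The function names `count_fastq_bases` and `expected_hom_coverage` are hypothetical, and it assumes uncompressed or gzipped FASTQ files with exactly four lines per record (unwrapped sequences).

```python
import gzip


def count_fastq_bases(path):
    """Sum sequence lengths in a FASTQ file.

    Assumption: four lines per record (sequence on the 2nd line of each
    record), which is typical for HiFi FASTQ but not guaranteed.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    total = 0
    with opener(path, "rt") as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:  # sequence line of each record
                total += len(line.strip())
    return total


def expected_hom_coverage(total_bases, haploid_genome_size):
    """Approximate homozygous coverage as total bases / haploid genome size."""
    if haploid_genome_size <= 0:
        raise ValueError("haploid_genome_size must be positive")
    return total_bases / haploid_genome_size
```

For the dataset in this thread, one would sum `count_fastq_bases` over the three input files and call `expected_hom_coverage(total, 1.1e9)`, using the 1.1 Gb estimate from the original post. If the result is far from 24, either the --hom-cov value or the genome size estimate is worth revisiting.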