Dear @chhylp123
hifiasm is a very useful tool, but getting a good genome assembly still requires a solid understanding of the software. I recently used HiFi data to assemble a genome of about 2.8 Gb. The statistics for the HiFi reads are as follows:
| format | type | num_seqs | sum_len | min_len | avg_len | max_len |
|---|---|---|---|---|---|---|
| FASTQ | DNA | 7,926,350 | 196,547,172,078 | 173 | 24,796.7 | 73,674 |
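As a quick sanity check on the read depth, the figures in the table can be plugged into a few lines of Python. This is only a rough sketch: it divides total bases by an assumed haploid genome size, once for the expected 2.8 Gb and once for the 3.0 Gb value given to `--hg-size`. The helper name `coverage` is made up for illustration.

```python
# Rough read-depth estimate from the read statistics in the table.
# Assumption: depth is approximated as total bases / haploid genome size.

SUM_LEN = 196_547_172_078   # total HiFi bases (sum_len column)
NUM_SEQS = 7_926_350        # number of reads (num_seqs column)


def coverage(total_bases: int, genome_size: float) -> float:
    """Return the approximate sequencing depth for a given genome size."""
    return total_bases / genome_size


cov_expected = coverage(SUM_LEN, 2.8e9)   # expected genome size, ~2.8 Gb
cov_hg_size = coverage(SUM_LEN, 3.0e9)    # size passed to --hg-size
mean_read_len = SUM_LEN / NUM_SEQS        # should match the avg_len column

print(f"depth at 2.8 Gb: {cov_expected:.1f}x")
print(f"depth at 3.0 Gb: {cov_hg_size:.1f}x")
print(f"mean read length: {mean_read_len:.1f} bp")
```

Under these assumptions the depth comes out at roughly 65x to 70x, depending on which genome size is used.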
The command I used is as follows:
hifiasm -o ${output} -t 32 --hg-size 3.0g ${input}.fastq
The log (attached as log.txt) reports `peak_hom: 66; peak_het: 64`. These two peaks are very close; will this affect the assembly result? The assembled genome is also slightly larger than expected and consists of several thousand contigs. I'm not sure which key parameters I may have overlooked (this is a diploid genome with 60X HiFi data).

Additionally, are there other tools for assessing the reliability of the current assembly? Can I align the sequencing reads to the assembled genome and check the alignment rate, the coverage depth across the genome, and the number of large structural variants, especially homozygous ones? Are there any other assessment methods you would recommend?

Thank you very much.

log.txt

Best wishes,
Zheng zhuqing
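For reference, the "slightly larger, several thousand contigs" observation can be quantified with basic contiguity statistics. The snippet below is a minimal sketch, assuming the contigs have already been exported to FASTA; the function names and the tiny in-memory demo are illustrative only and are not part of hifiasm.

```python
from typing import Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Return the N50: the contig length at which half the total is reached."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Tiny in-memory example; for a real assembly, pass an open file handle instead,
# e.g. contig_lengths(open("assembly.fa")) where "assembly.fa" is a hypothetical path.
demo = [">ctg1", "ACGTACGTAC", ">ctg2", "ACGTAC", ">ctg3", "ACGT"]
lengths = contig_lengths(demo)
print("contigs:", len(lengths))
print("total length:", sum(lengths))
print("N50:", n50(lengths))
```

Comparing the total length against the expected 2.8 Gb, and tracking the contig count and N50 across parameter changes, gives a simple baseline before moving on to read-alignment checks.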