Hi,
I used hifiasm to assemble an insect genome from HiFi reads only.
I tried a range of -s values: 0.75, 0.55, 0.50, 0.45, 0.3, 0.2, and 0.1. Surprisingly, every run produced a contig of exactly 202,096,177 bp in the *.bp.p_ctg.fa file. In addition, every assembly exceeded our expected genome size (about 1.1 Gb versus an estimated 890 Mb).
Based on the BUSCO results, I chose -s 0.50, using the following command:
hifiasm -o ALE_LA_hom-cov92_0.5 -t 48 -s 0.5 --hom-cov 92 --write-paf --write-ec ./ ../ALE_LA_hifireads.filt.NOmtDNA.fastq.gz 2>HiFiasm.log
I mapped the HiFi reads back to this very long contig and found a region with a mapping depth of 0.
In IGV, the first track is *.hap1.p_ctg.fa aligned to *.bp.p_ctg.fa; the second is *.hap2.p_ctg.fa aligned to *.bp.p_ctg.fa; the third is the HiFi reads (the same reads used for assembly) aligned to *.bp.p_ctg.fa.
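For reference, the zero-depth region can also be located without IGV. Below is a minimal sketch, assuming per-base depth was written with `samtools depth -a` (three tab-separated columns: contig, 1-based position, depth). The function name `low_depth_intervals` and the file name `depth.tsv` are illustrative only, not part of hifiasm or samtools.

```python
def low_depth_intervals(depth_lines, max_depth=0):
    """Collect runs of consecutive positions whose depth is <= max_depth.

    depth_lines: iterable of 'contig<TAB>pos<TAB>depth' strings
                 (the layout assumed for `samtools depth -a` output).
    Returns a list of (contig, start, end) tuples, 1-based and inclusive.
    """
    intervals = []
    current = None  # [contig, start, end] of the run being extended
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        contig, pos, depth = line.split("\t")[:3]
        pos, depth = int(pos), int(depth)
        if depth <= max_depth:
            # Extend the run only if it is on the same contig and adjacent.
            if current and current[0] == contig and pos == current[2] + 1:
                current[2] = pos
            else:
                if current:
                    intervals.append(tuple(current))
                current = [contig, pos, pos]
        elif current:
            # Depth is above the threshold, so the open run ends here.
            intervals.append(tuple(current))
            current = None
    if current:
        intervals.append(tuple(current))
    return intervals


if __name__ == "__main__":
    import os
    # Hypothetical file name; produced e.g. by `samtools depth -a reads.bam > depth.tsv`.
    if os.path.exists("depth.tsv"):
        with open("depth.tsv") as handle:
            for contig, start, end in low_depth_intervals(handle):
                print(contig, start, end, end - start + 1, sep="\t")
```

Calling it with `max_depth=2` instead of the default lists the regions covered by only 1 or 2 reads as well, which is what question (3) below is about.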
Here are my questions:
(1) How can a contig contain a region to which no reads map at all?
(2) Is it advisable to split the contig at this gap?
(3) Can I rely on regions with a mapping depth of only 1 or 2?
(4) I am considering using --b-cov and --m-rate to filter by depth, but I am unsure how they work in practice. For instance, if I set --b-cov 3, is the contig broken wherever the read mapping depth falls below 3, so that every resulting contig has a minimum depth of 3? And if I set --b-cov 3 --m-rate 0.8, how does --m-rate change the way the contig is broken?
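To make the reading of --b-cov that I am asking about in (4) concrete, here is a toy sketch of it. This is only my guess at the behaviour, not hifiasm's actual algorithm, and it ignores --m-rate entirely; the function name `split_by_min_depth` is made up for illustration.

```python
def split_by_min_depth(depths, min_cov):
    """Split one contig into segments where every position has depth >= min_cov.

    depths:  list of per-base depths along a single contig.
    min_cov: threshold, playing the role I assume --b-cov has.
    Returns a list of (start, end) segments, 0-based and half-open.
    """
    segments = []
    start = None  # start of the segment currently being extended
    for i, depth in enumerate(depths):
        if depth >= min_cov:
            if start is None:
                start = i
        elif start is not None:
            # Depth dropped below the threshold: close the open segment.
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(depths)))
    return segments


# A contig with a low-depth stretch in the middle is cut into two pieces.
print(split_by_min_depth([5, 4, 2, 0, 3, 6], 3))
```

If this reading is right, the positions with depth below 3 would be dropped and the flanking sequence kept as two separate contigs. Is that what actually happens?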
Your insights on these questions would be greatly appreciated.
HiFiasm.log