Open m-jahani opened 3 years ago
May I ask what's the size of the *hic.p_ctg.gfa*? Does this sample have sex chromosomes?
The size of *asm.hic.p_ctg.gfa is 844829462 bp. Yes, it does have sex chromosomes. The target genome is a female plant with XX sex chromosomes.
I personally think your assemblies are already pretty good. It is very hard to make the two haplotypes equal in size because of centromeric regions. As for the smaller size, I have no idea whether hifiasm really misses some regions or whether the two haplotypes should be that small. Could you please get the Hi-C heatmap or perform contig-to-contig alignment between the two haplotypes? Either approach may tell you whether hifiasm misses some regions (although I don't think hifiasm would lose 20 Mb of contigs for each haplotype).
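For the contig-to-contig alignment suggested above, the haplotype GFA files first need to be converted to FASTA. Here is a minimal sketch, assuming the contigs are stored as GFA segment (`S`) lines with the sequence in the third column. The function name and the file names in the comments are hypothetical.

```python
def gfa_to_fasta(gfa_lines):
    """Yield FASTA text lines for every segment (S) line of a GFA file."""
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue  # skip header, link and other non-segment lines
        fields = line.rstrip("\n").split("\t")
        name, seq = fields[1], fields[2]
        if seq == "*":
            continue  # sequence not stored in the file; nothing to write
        yield f">{name}"
        yield seq


# Example use (hypothetical file names):
#   with open("asm.hic.hap1.p_ctg.gfa") as gfa, open("hap1.fa", "w") as fa:
#       for out_line in gfa_to_fasta(gfa):
#           fa.write(out_line + "\n")
```

The two resulting FASTA files could then be passed to a whole-genome aligner such as minimap2 and inspected as a dot plot.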
Thanks for your reply. I will try the Hi-C heatmap and/or contig-to-contig alignment.
Another question. When I decrease the -s parameter to 48, the haplotype sizes are much closer (more balanced), and the BUSCO results are better too:
Information for *asm.hic.hap1.p_ctg.gfa with -s48 total contigs length: 780788341 BUSCO: C:96.4%[S:93.8%,D:2.6%],F:0.4%,M:3.2%,n:2326
Information for *asm.hic.hap2.p_ctg.gfa with -s48 total contigs length: 781757043 BUSCO: C:97.8%[S:95.2%,D:2.6%],F:0.3%,M:1.9%,n:2326
Do you recommend using -s48? Wouldn't that change other aspects of assembly quality?
Thanks
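To put "much closer" into numbers, the haplotype totals reported in this thread can be compared directly. This is a small arithmetic sketch. The `imbalance` helper is a hypothetical name, and the second run is labelled `--hom-cov 48` because that is the option the poster later says was actually used.

```python
# Total contig lengths (bp) for hap1 and hap2 as reported in this thread.
runs = {
    "default": (789_715_320, 776_385_689),
    "--hom-cov 48": (780_788_341, 781_757_043),
}


def imbalance(hap1, hap2):
    """Return (absolute difference in bp, difference as % of the larger haplotype)."""
    diff = abs(hap1 - hap2)
    return diff, 100.0 * diff / max(hap1, hap2)


for label, (hap1, hap2) in runs.items():
    diff, pct = imbalance(hap1, hap2)
    print(f"{label}: difference {diff:,} bp ({pct:.2f}% of the larger haplotype)")
```

The difference drops from 13,329,631 bp in the default run to 968,702 bp in the second run.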
Are you using -s0.48 or -s48?
My bad, I meant --hom-cov 48. Would either -s or --hom-cov work in my case?
Yeah, --hom-cov should be set to the homozygous coverage peak. You can try different values for -s to see if the results improve. Hifiasm is pretty fast once the bin files have been generated.
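Trying several -s values as suggested above could be scripted along these lines. This is only a sketch: the read file names, the output prefix, the thread count and the candidate -s values are placeholders chosen for illustration, and the commands are printed rather than executed.

```python
import shlex


def hifiasm_command(s_value, hom_cov=48, prefix="asm", threads=32,
                    hic_r1="hic_R1.fq.gz", hic_r2="hic_R2.fq.gz",
                    hifi_reads="hifi_reads.fq.gz"):
    """Build one hifiasm Hi-C command line as an argument list.

    All file names, the prefix and the thread count are placeholders.
    """
    return [
        "hifiasm",
        "-o", prefix,                # same prefix every time so existing bin files are found
        "-t", str(threads),
        "--hom-cov", str(hom_cov),   # homozygous coverage peak
        "-s", str(s_value),          # value being varied
        "--h1", hic_r1,
        "--h2", hic_r2,
        hifi_reads,
    ]


# Candidate -s values, for illustration only.
for s_value in (0.45, 0.55, 0.65):
    print(shlex.join(hifiasm_command(s_value)))
```

Because the bin files are looked up by output prefix, this sketch keeps the prefix fixed. The GFA files from one run would need to be copied elsewhere before the next run overwrites them.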
Hi, I assembled a diploid plant genome with the default Hi-C mode in hifiasm (0.15.5-r350). The genome size is expected to be 811 Mb (based on flow cytometry). The results look good, but I would like to push the quality as far as I can.
Here are the results that I got:
Information for *asm.hic.hap1.p_ctg.gfa total contigs length: 789715320 as % of genome: 96.54 % N50 5443612 BUSCO: C:96.9%[S:94.1%,D:2.8%],F:0.3%,M:2.8%,n:2326
Information for *asm.hic.hap2.p_ctg.gfa total contigs length: 776385689 as % of genome: 94.91 % N50 4514560 BUSCO: C:97.6%[S:95.1%,D:2.5%],F:0.3%,M:2.1%,n:2326
Information for *asm.hic.p_ctg.gfa total contigs length: 844829462 as % of genome: 103.28 % N50 12490608 BUSCO: C:98.0%[S:91.9%,D:6.1%],F:0.3%,M:1.7%,n:2326
Log file: hifiasm.log
Is it possible to improve my assembly size by tweaking settings? hap1 and hap2 are 789715320 bp and 776385689 bp, respectively, but the expected genome size is 811000000 bp.
hap1 and hap2 have different sizes. Is there any way to balance the haplotypes?
Thanks
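Figures such as the total contig length and N50 reported above can be recomputed from a GFA file with a short script. Here is a minimal sketch, assuming contigs appear as segment (`S`) lines and that a length tag `LN:i:` is present whenever the sequence is given as `*`. Both function names are hypothetical.

```python
def contig_lengths(gfa_lines):
    """Collect contig lengths from the segment (S) lines of a GFA file."""
    lengths = []
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue  # only segment lines describe contigs
        fields = line.rstrip("\n").split("\t")
        seq = fields[2]
        if seq != "*":
            lengths.append(len(seq))
            continue
        # Sequence omitted: fall back to the optional LN:i:<length> tag.
        for tag in fields[3:]:
            if tag.startswith("LN:i:"):
                lengths.append(int(tag[5:]))
                break
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the total."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Example use (hypothetical file name):
#   with open("asm.hic.hap1.p_ctg.gfa") as gfa:
#       lengths = contig_lengths(gfa)
#   print(sum(lengths), n50(lengths))
```

Running this on both haplotype files gives a quick way to compare sizes between runs with different settings.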