Open xyxy0053 opened 2 years ago
Are both haplotypes larger as well, at around 400Mb? Is your sample diploid?
About 360MB for the two haplotypes and 390+MB for the final genome. The most troublesome issue is the overly long contigs I got. According to the reference genome of a homologous species, the pseudo-chromosomes range from 20MB to 50MB. However, I got two contigs that are nearly 90MB long. For hap1 and hap2, I didn't find any super-long contigs.
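For anyone who wants to run the same check on their own assembly, here is a minimal sketch that lists contigs longer than the largest chromosome expected from a related reference (50MB in this case). It assumes the contig GFA has already been converted to FASTA; the function names and the file name in the usage note are hypothetical and not part of hifiasm.

```python
def contig_lengths(fasta_lines):
    """Return {contig_name: length_in_bp} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def flag_long_contigs(lengths, max_expected_bp):
    """Return (name, length) pairs longer than max_expected_bp, longest first."""
    flagged = [(n, l) for n, l in lengths.items() if l > max_expected_bp]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Usage would look something like `flag_long_contigs(contig_lengths(open("asm.p_ctg.fa")), 50_000_000)`, where the threshold is an assumption taken from the 20MB to 50MB range above. A contig over the threshold is only a candidate misjoin; Hi-C contact maps are still needed to confirm it.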
@chhylp123
To avoid misassemblies, please see https://hifiasm.readthedocs.io/en/latest/faq.html#how-do-i-avoid-misassemblies. I would also recommend you conduct Hi-C scaffolding to fix potential misassemblies directly.
Thanks. I will try the "-D" and "--purge-max" options. Additionally, given my situation (overly long contigs), should I set a larger value for "-D"? And how should I set "--purge-max" to get a better assembly, and what is its default value?
@chhylp123
It would be better to set a little bit larger value for -O and enable -u.
I am using hifiasm with default parameters to assemble a plant genome (HiFi sequencing and Hi-C) with an estimated genome size of 330MB. However, I got an assembly of nearly 400MB. In particular, several supercontigs are much longer than expected when compared with the high-quality reference genomes of homologous species. Is there a way to address this issue?