mikolmogorov / Flye

De novo assembler for single molecule sequencing reads using repeat graphs

Is the assembly.fasta redundant without "keep-haplotypes"? #701

Closed YouxinZhao closed 1 month ago

YouxinZhao commented 1 month ago

flye --nano-corr ~/zhaobaojun/t2t_genome/4_necat/cns_final.fasta.gz --genome-size 2.6g --threads 100 --out-dir flye_necat_out

I used Flye to assemble a genome without "--keep-haplotypes" and got an assembly of 3.7 Gb, while the genome size I specified was 2.6 Gb. I want to know whether the assembly is redundant and whether I need to use a tool, e.g. "purge_dups", to obtain the final genome.

mikolmogorov commented 1 month ago

Yes, --keep-haplotypes is designed to retain alternative alleles in the assembly. They should be sufficiently different (e.g. many mismatches or structural variations). It is difficult to say based on the assembly size alone, but if you are not sure, running purge haplotigs may be a good idea.

Hope this helps, Misha
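
For anyone comparing their own numbers, here is a minimal Python sketch of the size check discussed in this thread: sum the sequence lengths in a FASTA file and divide by the expected genome size. The function names `fasta_total_length` and `size_ratio` are made up for this illustration and are not part of Flye. The figures at the bottom are the ones reported above (about 3.7 Gb assembled, 2.6 Gb passed to --genome-size).

```python
import gzip


def fasta_total_length(path):
    """Sum the sequence lengths in a FASTA file (plain text or .gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    total = 0
    with opener(path, "rt") as handle:
        for line in handle:
            # Header lines start with '>'; everything else is sequence.
            if not line.startswith(">"):
                total += len(line.strip())
    return total


def size_ratio(assembly_bp, expected_bp):
    """Return assembled length divided by expected genome size."""
    return assembly_bp / expected_bp


# Figures from this thread; replace 3.7e9 with
# fasta_total_length("flye_necat_out/assembly.fasta") for a real run.
ratio = size_ratio(3.7e9, 2.6e9)
print(f"assembly / expected = {ratio:.2f}")  # prints 1.42
```

A ratio well above 1 is compatible with some alternative haplotypes being retained, but as noted above, size alone does not settle the question. The expected genome size may itself be an estimate.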