chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

Are these assembly errors? #405

Closed zhangyixing3 closed 1 year ago

zhangyixing3 commented 1 year ago

Dear author, I am assembling an autopolyploid with a ploidy of about 10, and I have run into trouble during scaffolding. I used utg.fa and Hi-C data to partition the contigs, but it failed: more than 20,000 contigs were assigned to the same group, whereas we want them to be allocated to different groups.

I then ran SALSA to look for misassembled contigs, and I did find some sequences that might be incorrectly assembled. SALSA reported 295 potential errors, and I believe its results are conservative. Correcting the misassembled contigs is also quite difficult. This polyploid case is a little different from the situation described in https://github.com/baozg/phased-assembly-check .

Can I use appropriate parameters to reduce these misassemblies? My command is:

hifiasm -o my -t 30 -D 10 -N 200 read1.fq read2.fq read3.fq

Thank you!

Hi-C heatmaps for two of these contigs: utg000126l, utg001363l

chhylp123 commented 1 year ago

Are you using unitigs extracted from the p_utg.gfa? If you are using p_utg.gfa, there might be a few misassemblies. I guess misassemblies should not be a major issue.
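For readers unsure what "unitigs extracted from the p_utg.gfa" involves, here is a minimal sketch of one way to do it. It relies only on the GFA convention that each segment is a tab-separated `S` line with the name in the second field and the sequence in the third. The function name `gfa_to_fasta` and the file names in the usage comment are illustrative assumptions, not part of hifiasm.

```python
def gfa_to_fasta(gfa_handle, fasta_handle, width=80):
    """Write every GFA segment (S line) as a FASTA record.

    Returns the number of records written. Hypothetical helper for
    illustration; it is not part of hifiasm.
    """
    written = 0
    for line in gfa_handle:
        # Only segment lines carry sequences; skip H, L, A and other records.
        if not line.startswith("S\t"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, seq = fields[1], fields[2]
        # "*" means the sequence is not stored in the GFA file.
        if seq == "*":
            continue
        fasta_handle.write(f">{name}\n")
        # Wrap the sequence at a fixed line width.
        for start in range(0, len(seq), width):
            fasta_handle.write(seq[start:start + width] + "\n")
        written += 1
    return written


# Usage (file names are assumptions; adjust them to your own output prefix):
# with open("my.bp.p_utg.gfa") as gfa, open("utg.fa", "w") as fasta:
#     print(gfa_to_fasta(gfa, fasta), "unitigs written")
```

The resulting FASTA keeps the unitig names (for example utg000126l), so coordinates reported by downstream tools can be traced back to the graph.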

zhangyixing3 commented 1 year ago

I am using p_utg.gfa. In addition, I ran TRF to find repetitive sequences and found that utg000126l:7760000-8730000 is composed of tandem repeats, while utg001363l:4387500-4612500 and utg001363l:900000-10125000 are not.