Closed by yt619 6 months ago
Sorry for the late reply. This is a good question. CRAQ does not currently evaluate haplotype switches. I have been thinking about this problem for a long time, but unfortunately I still cannot give a definitive answer. Some of CRAQ's outputs may nevertheless serve as a reference.
For diploid organisms, CRAQ reports true biological differences between the two haplotypes as CRH/CSH, and assembly errors as CRE/CSE. Users can input haplotype 1 and haplotype 2 separately to detect the assembly errors in each (first CRAQ -g hap1.fa -sms sms.fq -ngs NGS.R1.fq,NGS.R2.fq, then CRAQ -g hap2.fa -sms sms.fq -ngs NGS.R1.fq,NGS.R2.fq).
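A minimal sketch of the two runs described above as a small script. The flags and file names are copied from the commands in this reply; the DRY_RUN variable and the craq_cmd helper are hypothetical names added for illustration and are not part of CRAQ. By default the script only prints the commands, so they can be checked before anything is executed.

```shell
#!/bin/sh
# Sketch: run CRAQ once per haplotype, one after the other.
# DRY_RUN and craq_cmd are illustrative helpers, not CRAQ features.

SMS_READS="sms.fq"
NGS_READS="NGS.R1.fq,NGS.R2.fq"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the CRAQ command line for one haplotype FASTA file.
craq_cmd() {
  echo "CRAQ -g $1 -sms $SMS_READS -ngs $NGS_READS"
}

for hap in hap1.fa hap2.fa; do
  cmd="$(craq_cmd "$hap")"
  echo "$cmd"
  if [ "$DRY_RUN" = "0" ]; then
    # Left unquoted on purpose so the string splits into command and arguments.
    $cmd
  fi
done
```

Set DRY_RUN=0 to actually launch the runs. This sketch does not control where CRAQ writes its results, so it is worth checking that the second run does not overwrite the output of the first.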
However, evaluating haplotype phasing also requires detecting haplotype switches (a region that was assembled into hap1 but actually belongs to hap2), a feature that CRAQ currently lacks. The low accuracy of ONT sequencing makes this even more frustrating. I believe the best current approach to phasing evaluation relies on having the two parental genomes available as references.
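To make the idea of a haplotype switch concrete, here is a small sketch. It is not a CRAQ feature. It assumes that parent-specific markers along one assembled haplotype have already been labelled P (paternal) or M (maternal), for example by comparison with the two parental genomes; how those labels are obtained is outside the sketch. A switch is then counted whenever two consecutive markers carry different labels. The function name count_switches is hypothetical.

```shell
#!/bin/sh
# Sketch: count haplotype switches in an ordered list of parental labels.
# Each argument is the label (P or M) of one marker, in order along a contig.

count_switches() {
  prev=""
  switches=0
  for label in "$@"; do
    # A switch is a change of label between consecutive markers.
    if [ -n "$prev" ] && [ "$label" != "$prev" ]; then
      switches=$((switches + 1))
    fi
    prev="$label"
  done
  echo "$switches"
}

# Four paternal markers, three maternal, then two paternal again.
count_switches P P P P M M M P P   # prints 2
```

With noisy reads, a single mislabelled marker would show up as two switches in this count, so in practice isolated label changes would need to be filtered or smoothed before they are interpreted as real switches.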
For assemblies of homologous polyploids, phasing is the greatest challenge. Is it possible to identify phasing errors in an assembly using ONT data, and to break the erroneously assembled regions apart? We hope you can help us resolve this issue, as haplotype resolution in polyploids is critically important. The accuracy of ONT reads appears to be only around Q18, so can such erroneous identifications be avoided at the parameter level?
Thank you very much for your attention to this matter.
Sincerely, Tuo Yang