JiaoLaboratory / CRAQ

Identification of errors in draft genome assemblies with single-base pair resolution for quality assessment and improvement
https://doi.org/10.1038/s41467-023-42336-w
MIT License

Usage guidance #18

Open LGG02 opened 3 months ago

LGG02 commented 3 months ago

Hi

I have a phased assembly and binned HiFi reads, but binning the short reads is difficult because of their length. I am using binned HiFi data and unbinned short reads for evaluation. How do the unbinned short reads affect the results?

JiaoLaboratory commented 2 months ago

CRAQ is suitable for evaluating individual phased genomes, such as haplotype1 and haplotype2, and users can input all unbinned reads (all reads) into CRAQ. Compared with using binned reads, evaluating a phased genome with unbinned short reads may lead CRAQ to report more composite heterozygous regions (CRHs), potentially misidentifying switch errors as heterozygous regions. In theory, however, it has little impact on assembly errors such as CSEs (used to calculate S-AQI) and CREs (used to calculate R-AQI).