HKU-BAL / Clair3

Clair3 - Symphonizing pileup and full-alignment for high-performance long-read variant calling

Unexpected 0/0 records #315

Krannich479 closed this issue 3 months ago

Krannich479 commented 3 months ago

Dear Clair3 dev team,

Background

I am using Clair3 for SNV/indel detection from ONT sequence data. The sequence data is from an amplicon sequencing experiment (high read coverage) of a mixed SARS-CoV-2 sample.

Clair3 call

run_clair3.sh \
    --bam_fn="${bam}" \
    --bed_fn="${params.bed}" \
    --ref_fn="${params.ref}" \
    --threads="${task.cpus}" \
    --platform="ont" \
    --model_path="${params.model}" \
    --snp_min_af=0.08 \
    --indel_min_af=0.15 \
    --no_phasing_for_fa \
    --var_pct_full=1 \
    --ref_pct_full=1 \
    --output="${baseDir}/${params.outdir}"

Issue

In IGV I can identify multiple variant sites that are either missing from the resulting VCF files or reported as 0/0. For example, the screenshot below shows an SNV with an ALT allele ratio of ~0.17 and a 9 bp indel with ~150 of 770 reads supporting ALT.

[Image: Clair3-issue-IGV]

The only two variant records of full_alignment.vcf.gz in the screenshot region are:

NC_045512.2     21618   .       C       .       20.38   RefCall F       GT:GQ:DP:AD:AF  0/0:20:884:692:0.7828
NC_045512.2     21632   .       T       .       9.12    RefCall F       GT:GQ:DP:AD:AF  0/0:9:883:721:0.8165

The record at 21618 has FILTER=RefCall, but the VCF record implies an ALT allele ratio of ~0.22 (1 - 0.7828). Since snp_min_af is 0.08, shouldn't this be some sort of ALT call?

The record at position 21632 is also a REF call. With an ALT AF of ~0.19, shouldn't this be some sort of indel ALT call?
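The non-reference fractions discussed for the two records can be recomputed from their FORMAT fields. The following is a minimal Python sketch, not part of Clair3. It assumes that in these RefCall records AD holds the reference-allele depth and AF the reference fraction, which is inferred only from the numbers shown (692/884 = 0.7828). The helper name `non_ref_fraction` is hypothetical.

```python
# Records copied from full_alignment.vcf.gz in the post (whitespace-normalised).
RECORDS = [
    "NC_045512.2 21618 . C . 20.38 RefCall F GT:GQ:DP:AD:AF 0/0:20:884:692:0.7828",
    "NC_045512.2 21632 . T . 9.12 RefCall F GT:GQ:DP:AD:AF 0/0:9:883:721:0.8165",
]


def non_ref_fraction(record):
    """Return (position, fraction of reads not supporting REF) for one VCF record.

    Assumption: the first AD value is the reference-allele depth.
    """
    fields = record.split()
    pos = int(fields[1])
    # Pair FORMAT keys (column 9) with the sample values (column 10).
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    depth = int(sample["DP"])
    ref_depth = int(sample["AD"].split(",")[0])
    return pos, 1 - ref_depth / depth


for rec in RECORDS:
    pos, frac = non_ref_fraction(rec)
    print(pos, round(frac, 4))
```

Under that assumption, roughly 22% of reads at 21618 and roughly 18% at 21632 do not support REF. Both values are above the snp_min_af and indel_min_af thresholds passed in the call.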

Neither record appears in merge_output.vcf.gz. I assume only FILTER=PASS records make it into the merged output?
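One way to test the assumption about the merged output is to tally the FILTER column of each VCF and compare the results. This is a hedged sketch using only the Python standard library. The function names are hypothetical, and it says nothing about how Clair3 actually builds merge_output.vcf.gz.

```python
import gzip
from collections import Counter


def filter_counts(lines):
    """Count FILTER values (column 7) over tab-separated VCF data lines."""
    counts = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        counts[line.rstrip("\n").split("\t")[6]] += 1
    return counts


def filter_counts_from_file(path):
    """Same tally, reading a gzip- or bgzip-compressed VCF from disk."""
    with gzip.open(path, "rt") as handle:
        return filter_counts(handle)
```

For example, calling `filter_counts_from_file` on full_alignment.vcf.gz and on merge_output.vcf.gz in the output directory would show whether RefCall records occur in the former but not in the latter.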

Expected behavior

Given my Clair3 call, I expected SNVs with AF > 0.08 to be genotyped 0/1 or 1/1 with FILTER equal to PASS or LowQual. I also expected an indel call at 21632.

Additional info

Clair3 version: 1.0.8
Clair3 model: r1041_e82_400bps_hac_g632
OS: Ubuntu 20.04.6 LTS


Might be related to #314

aquaskyline commented 3 months ago

Please check whether the answer in #270 addresses your concern.

Krannich479 commented 3 months ago

#270:

For detecting variants with AF<0.2, try ClairS-TO.

I'll try this. Thanks for the pointer!