HKU-BAL / Clair3

Clair3 - Symphonizing pileup and full-alignment for high-performance long-read variant calling

Homo / heterozygous call issue #49

Closed kim-fehl closed 3 years ago

kim-fehl commented 3 years ago

We noticed the following line in the *.vcf result file:

chr1 13116 . T G 15.23 PASS P GT:GQ:DP:AD:AF:PL 1/1:15:27:18,9:0.3333:22,27,0

With an AD of 18,9, shouldn't the GT be heterozygous (0/1) rather than homozygous (1/1) as in the output?

Full command from the log:

clair3.sh --bam_fn fk4034.merged.hg19.bam --ref_fn hg19.fna --threads 24 --model_path ont_guppy5 \
--platform ont --output clair3.gvcf --bed_fn=EMPTY --vcf_fn=EMPTY --ctg_name=EMPTY \
--sample_name=SAMPLE --chunk_num=0 --chunk_size=5000000 --samtools=samtools --python=python3 \
--pypy=pypy3 --parallel=parallel --whatshap=whatshap --qual=2 --var_pct_full=0.3 --ref_pct_full=0.1 \
--snp_min_af=0 --indel_min_af=0 --pileup_only=False --gvcf=True --fast_mode=False --call_snp_only=False \
--print_ref_calls=False --haploid_precise=False --haploid_sensitive=False --include_all_ctgs=False \
--no_phasing_for_fa=False --pileup_model_prefix=pileup --fa_model_prefix=full_alignment

aquaskyline commented 3 years ago

Not necessarily, if the Ts at the position coexist with a certain pattern of mismatches in the adjacent positions. In ONT reads, there are a few cases in which an authentic homozygous variant has an allele frequency below 0.5. Technically, allele frequency is just one factor considered by the Clair3 model, together with the signals in the flanking positions. If the reads supporting the reference allele look suspicious, they can be ignored, leading to a 1/1 decision despite a high reference allele count. The variant quality is relatively low in your case, so it could either be a mistake by Clair3 or a genuine 1/1.
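For anyone who wants to list such calls for manual inspection, here is a minimal Python sketch, not part of Clair3. It decodes the sample column of the record quoted above and flags 1/1 calls whose alternate-allele fraction is below 0.5 and whose QUAL is low. The function names (`parse_sample`, `summarize`, `needs_review`) and the cutoffs (`qual_cutoff=20.0`, `af_cutoff=0.5`) are hypothetical choices for illustration, and the PL order 0/0, 0/1, 1/1 assumes a biallelic site as defined in the VCF specification.

```python
# Record copied from the issue; real VCF lines are tab-separated,
# so split() on any whitespace is used to handle both forms.
record = "chr1 13116 . T G 15.23 PASS P GT:GQ:DP:AD:AF:PL 1/1:15:27:18,9:0.3333:22,27,0"


def parse_sample(vcf_line):
    """Map FORMAT keys to the values of the first sample column."""
    fields = vcf_line.split()
    keys = fields[8].split(":")
    values = fields[9].split(":")
    return dict(zip(keys, values))


def summarize(vcf_line):
    """Return the called genotype, alt-allele fraction, and PL-based summary."""
    sample = parse_sample(vcf_line)
    ref_count, alt_count = (int(x) for x in sample["AD"].split(","))
    depth = int(sample["DP"])
    pls = [int(x) for x in sample["PL"].split(",")]
    genotypes = ["0/0", "0/1", "1/1"]  # VCF PL order for a biallelic site
    ordered = sorted(pls)
    return {
        "gt": sample["GT"],
        "alt_fraction": alt_count / depth,
        "best_pl_genotype": genotypes[pls.index(min(pls))],
        # Gap between the best and second-best genotype likelihoods.
        "pl_margin": ordered[1] - ordered[0],
    }


def needs_review(vcf_line, qual_cutoff=20.0, af_cutoff=0.5):
    """Flag low-QUAL 1/1 calls with a sub-0.5 alt fraction (hypothetical cutoffs)."""
    qual = float(vcf_line.split()[5])
    summary = summarize(vcf_line)
    return (
        summary["gt"] == "1/1"
        and summary["alt_fraction"] < af_cutoff
        and qual < qual_cutoff
    )


if __name__ == "__main__":
    print(summarize(record))
    print(needs_review(record))
```

For the record in this issue, the alternate fraction is 9/27, the lowest PL belongs to 1/1, and the call is flagged for review. A flag only marks a site worth inspecting in the alignments. It does not mean the genotype is wrong.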
