Closed weishwu closed 7 months ago
Are you using these options? https://github.com/HKU-BAL/Clair3?tab=readme-ov-file#dealing-with-amplicon-data
@aquaskyline Thanks! I added those options and they rescued 2 out of the 4 SNPs. However, these 2 SNPs appear true in IGV, but were still labelled as RefCall (they are reported only in pileup.vcf.gz and full_alignment.vcf.gz but not in merge_output.vcf.gz):
C412_SP 2296 . G . 16.36 RefCall P GT:GQ:DP:AD:AF 0/0:16:4515:3614:0.8004
C412_SP 2299 . C . 15.97 RefCall P GT:GQ:DP:AD:AF 0/0:15:4552:3724:0.8181
The variant AF values shown in IGV are higher than in the VCF, which I guess is because some of the reads didn't pass the quality threshold.
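As a quick sanity check on the AF values in the two records above, the sketch below recomputes AD/DP from the sample column. It relies on an assumption not stated in the thread: because ALT is `.`, the single AD value is taken to be the reference-supporting read count, so AF in a RefCall record would be a reference-allele fraction and `1 - AD/DP` the fraction of reads not matching the reference. The helper names `sample_fields` and `ad_over_dp` are made up for illustration.

```python
# Records copied from the pileup output quoted above (whitespace-separated here).
records = [
    "C412_SP 2296 . G . 16.36 RefCall P GT:GQ:DP:AD:AF 0/0:16:4515:3614:0.8004",
    "C412_SP 2299 . C . 15.97 RefCall P GT:GQ:DP:AD:AF 0/0:15:4552:3724:0.8181",
]


def sample_fields(record):
    """Map FORMAT keys to sample values for a single-sample VCF line."""
    cols = record.split()
    return dict(zip(cols[8].split(":"), cols[9].split(":")))


def ad_over_dp(record):
    """Return AD / DP for a record that carries a single AD value."""
    fields = sample_fields(record)
    # Assumption: with ALT == '.', the lone AD value counts reference-supporting reads.
    return int(fields["AD"]) / int(fields["DP"])


for rec in records:
    pos = rec.split()[1]
    reported_af = sample_fields(rec)["AF"]
    ratio = ad_over_dp(rec)
    # Columns: position, AF as reported, recomputed AD/DP, remaining read fraction.
    print(pos, reported_af, round(ratio, 4), round(1 - ratio, 4))
```

If that assumption holds, the reported AF and the alternative-allele frequency shown in IGV are not directly comparable numbers.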
My command-line:
# clair3 version: 1.0.5
run_clair3.sh \
--bam_fn={input} \
--ref_fn={params.genome_fasta} \
--include_all_ctgs \
--ref_pct_full=1.0 \
--var_pct_full=1.0 \
--no_phasing_for_fa \
--output={params.outdir} \
--threads={threads} \
--platform=ont \
--model_path=r1041_e82_400bps_hac_v410
Could you please show what the records of C412_SP:2296 and C412_SP:2299 look like in the full_alignment.vcf.gz file?
C412_SP 2296 . G . 30.17 RefCall F GT:GQ:DP:AD:AF 0/0:30:11094:7072:0.6375
C412_SP 2299 . C . 31.41 RefCall F GT:GQ:DP:AD:AF 0/0:31:11131:7248:0.6512
Thanks.
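For anyone wanting to compare the same positions across the three output files mentioned in this thread, here is a minimal Python sketch. It assumes the files are bgzip- or gzip-compressed and tab-delimited; the function name `records_at` and the output directory `clair3_output` are hypothetical placeholders, not part of Clair3.

```python
import gzip
from pathlib import Path


def records_at(vcf_path, contig, positions):
    """Return VCF data lines on `contig` whose POS is in `positions`."""
    wanted = {str(p) for p in positions}
    hits = []
    # bgzip output is a series of gzip members, which gzip.open reads transparently.
    with gzip.open(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header lines
            cols = line.rstrip("\n").split("\t", 2)
            if len(cols) >= 2 and cols[0] == contig and cols[1] in wanted:
                hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    outdir = Path("clair3_output")  # hypothetical path; substitute the real --output directory
    for name in ("pileup.vcf.gz", "full_alignment.vcf.gz", "merge_output.vcf.gz"):
        path = outdir / name
        if path.exists():
            print(name)
            for rec in records_at(path, "C412_SP", [2296, 2299]):
                print("  " + rec)
```

A linear scan is enough for a small amplicon reference; for large genomes an indexed lookup would be the more usual choice.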
Clair3's model has decided with good quality (GQ) that 2296 and 2299 are not variants, and that the reads supporting an alternative allele are more likely to be sequencing or alignment errors.
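To put the GQ values in perspective: the VCF specification defines GQ as a Phred-scaled quality, so a score Q corresponds to an error probability of 10^(-Q/10). The sketch below applies that conversion to the GQ values quoted in this thread. How well calibrated Clair3's scores are on amplicon data is not something the thread states, so the resulting probabilities are only a rough guide. The function name `phred_to_error_prob` is made up for illustration.

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality to the implied probability that the call is wrong."""
    return 10 ** (-q / 10)


# GQ values taken from the records quoted in this thread.
gq_values = {
    "pileup 2296": 16,
    "pileup 2299": 15,
    "full_alignment 2296": 30,
    "full_alignment 2299": 31,
}

for label, gq in gq_values.items():
    print(f"{label}: GQ={gq} -> implied error probability {phred_to_error_prob(gq):.4f}")
```

Under this convention the full-alignment calls (GQ 30 and 31) imply roughly a 1-in-1000 chance of a wrong genotype, noticeably more confident than the pileup calls at the same positions.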
@aquaskyline OK. Thanks!
These SNPs look true on IGV. Why are they labelled as RefCall by Clair3? The AF values in the output VCF are around 0.5.
Another question: Clair3 is designed to find germline variants. However, my data come from amplicon sequencing and may contain mosaic variants whose frequencies can span a wide range. Can Clair3 identify these variants? I don't have tumor-normal pairs, so I can't use ClairS.