HKU-BAL / Clair3

Clair3 - Symphonizing pileup and full-alignment for high-performance long-read variant calling

Variants not reported in merge_output. Why? #229

Closed SkabbiVML closed 1 year ago

SkabbiVML commented 1 year ago

Hello developers!

I'm running Clair3 on WGS data from tumor biopsies. I'm a bit confused about how variants are reported in the merge_output.vcf file. In the example below, two variants can be seen with an AF of around 35%. Both are reported in the pileup.vcf file and are not filtered out. Both appear in the full_alignment.vcf file but are filtered out as RefCall. Neither appears in the merge_output file.

Why are they filtered from the full alignment if the AF is 35%? Why are they not included in the merge_output file? Clair3 was run with a reference genome and a Clair3 model; everything else was default. Are there settings that I need to set to get these reported? Should I perhaps look only at the pileup file and ignore the full_alignment file?

Both variants have been confirmed by Illumina sequencing.

image

Thanks for your help

Skabbi

aquaskyline commented 1 year ago

Is this a known true variant? None of the reads that have the C on the left have the G on the right. That is not conclusive, but it can be a strong indication of false-positive variants.

SkabbiVML commented 1 year ago

Hi @aquaskyline

Both of these variants have been previously reported in this sample by Illumina. The variants separate completely by phasing.

image

aquaskyline commented 1 year ago

If that's the case, it's likely that the Clair3 full-alignment model failed to call the variant. Clair3 was designed primarily for germline variant calling, and the full-alignment model is more stringent than the pileup model when searching for germline variant signals. An AF of 0.35 deviates from the expected heterozygous AF of 0.5, which might be why the full-alignment model fails to call the variant. If you want to use Clair3 for somatic variant calling, you might want to use only the pileup calls for maximum sensitivity.
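For anyone who wants to see which pileup calls were dropped from the merged output, here is a minimal sketch in plain Python. It is not part of Clair3. It assumes standard tab-separated VCF records with a single sample column, that uncalled sites carry `RefCall` in the FILTER column as described above, and that the allele fraction is stored as an `AF` key in the FORMAT/sample columns. The function names and the `min_af` parameter are hypothetical.

```python
import gzip
from typing import Iterable, Iterator, List, Optional, Tuple

Key = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def open_vcf(path: str):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def parse_vcf(lines: Iterable[str]) -> Iterator[Tuple[Key, str, dict]]:
    """Yield (key, FILTER, sample fields) for every record line."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        f = line.rstrip("\n").split("\t")
        key = (f[0], int(f[1]), f[3], f[4])
        # Pair FORMAT keys with the values of the first sample, if present.
        sample = dict(zip(f[8].split(":"), f[9].split(":"))) if len(f) > 9 else {}
        yield key, f[6], sample


def pileup_only_calls(
    pileup_lines: Iterable[str],
    merged_lines: Iterable[str],
    min_af: float = 0.0,
) -> List[Tuple[Key, str, Optional[float]]]:
    """Return pileup calls that are not called in the merged output."""
    merged = {k for k, filt, _ in parse_vcf(merged_lines) if filt != "RefCall"}
    result = []
    for key, filt, sample in parse_vcf(pileup_lines):
        if filt == "RefCall" or key in merged:
            continue
        try:
            # Assumption: AF is a FORMAT field; take the first value if several.
            af = float(sample.get("AF", "").split(",")[0])
        except ValueError:
            af = None  # AF missing or not numeric
        if af is not None and af < min_af:
            continue
        result.append((key, filt, af))
    return result
```

With real files, the two arguments could be the handles returned by `open_vcf(...)` for the pileup and merged VCFs. Calls recovered this way have not passed the stricter full-alignment model, so they deserve orthogonal confirmation, such as the Illumina data mentioned above.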

SkabbiVML commented 1 year ago

Thanks for clearing that up for me @aquaskyline, very informative. I thought the minimum AF threshold was set with --snp_min_af, but I guess that only applies if the variant passes the full-alignment criteria.

Cheers

S