HKU-BAL / Clair3

Clair3 - Symphonizing pileup and full-alignment for high-performance long-read variant calling

Missense variant in FBN2 gene not reported by Clair3 #287

Closed tsneddon closed 7 months ago

tsneddon commented 8 months ago

Hi: We just ran Clair3 on our first human ONT long-read data. When comparing against the SNV calls we identified in the same genome with Illumina short reads, we found a missense variant that did not make it into the Clair3 VCF file. The VCF header is:

##fileformat=VCFv4.2
##source=Clair3
##clair3_version=1.0.4
##cmdline=/nas/longleaf/rhel8/apps/clair3/1.0.4/Clair3/run_clair3.sh --bam_fn=FBA-0005LR_merged.bam --ref_fn=../Homo_sapiens_assembly38.fasta --threads=40 --platform=ont --model_path=./r1041_e82_400bps_hac_v420 --output=./clair3_results/
##reference=/proj/barc/projects/GENYSIS_Nov2023/FBA-0005LR/../Homo_sapiens_assembly38.fasta
##FILTER=
##FILTER=
##FILTER=
##INFO=
##INFO=
##FORMAT=
##FORMAT=
##FORMAT=
##FORMAT=
##FORMAT=
##FORMAT=

Looking at the alignment in IGV, it seems like it should have been called:

[Screenshot: IGV view of the alignment at the variant site, 2024-03-27]

Any thoughts on why it is not in the Clair3 output file? Thanks!
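For anyone checking a similar case, it can help to first confirm that the site is truly absent from the VCF, rather than present with a non-PASS filter or a reference genotype. The following is a minimal sketch, not part of Clair3: `find_records` is a hypothetical helper, and the path and coordinates in the usage comment are placeholders, since the thread does not give the variant's position.

```python
import gzip

def find_records(vcf_path, chrom, pos):
    """Return the tab-split data lines of a VCF whose CHROM and POS match.

    Hypothetical helper for illustration. It only matches records that
    start exactly at `pos`, so an indel anchored one base earlier would
    not be found.
    """
    # Read plain or gzip/bgzip-compressed VCFs transparently.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    hits = []
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip blank lines and header lines ("##..." and "#CHROM ...").
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            if fields[0] == chrom and int(fields[1]) == pos:
                hits.append(fields)
    return hits

# Usage (path and coordinates are placeholders, not values from this thread):
# for rec in find_records("clair3_results/merge_output.vcf.gz", "chr5", 123456):
#     print(rec[0], rec[1], rec[3], rec[4], rec[5], rec[6])  # CHROM POS REF ALT QUAL FILTER
```

An empty result means no record starts at that position; a hit with a FILTER value other than PASS means the site was evaluated but not reported as a confident call.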

aquaskyline commented 8 months ago

It's a bit hard to decide because of strand bias: of the four reads showing T, all four are on the positive strand and none are on the negative strand.
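To make the strand-bias observation concrete, here is a minimal sketch that tallies per-strand support for an allele and runs a two-sided Fisher's exact test on the resulting 2x2 table. This is a generic illustration, not how Clair3 evaluates strand bias. Only the alternate-allele counts (four T on the forward strand, none on the reverse) come from this thread; the reference base `C` and the reference-read counts are made-up values, and both function names are hypothetical.

```python
from math import comb

def strand_counts(observations, allele):
    """Count forward- and reverse-strand reads carrying `allele`.

    `observations` is an iterable of (base, is_reverse) pairs, one per
    read covering the site.
    """
    fwd = sum(1 for base, is_rev in observations if base == allele and not is_rev)
    rev = sum(1 for base, is_rev in observations if base == allele and is_rev)
    return fwd, rev

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows are alleles (alt, ref); columns are strands (forward, reverse).
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x alt reads on the forward strand.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Alt counts (T: 4 forward, 0 reverse) are from the thread.
# The reference base "C" and its counts (3 forward, 5 reverse) are invented.
reads = [("T", False)] * 4 + [("C", False)] * 3 + [("C", True)] * 5
alt_fwd, alt_rev = strand_counts(reads, "T")
ref_fwd, ref_rev = strand_counts(reads, "C")
p_value = fisher_two_sided(alt_fwd, alt_rev, ref_fwd, ref_rev)
print(alt_fwd, alt_rev, ref_fwd, ref_rev, round(p_value, 4))
```

With only four supporting reads, a test like this has little power, which is one reason a site with all support on one strand is hard to judge either way.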

tsneddon commented 8 months ago

OK, thanks. Does Clair3 have a default cutoff for filtering on strand bias? If so, how do we override it?

aquaskyline commented 7 months ago

Clair3 handles strand bias in its neural network; it does not use a hard filter. Our observation is that Clair3 loosens the cutoff when coverage is low but all other supporting evidence is good. However, extreme cases, such as having all supporting reads on one strand, are mostly filtered out.

tsneddon commented 7 months ago

Thank you for the explanation!