HKU-BAL / Clair3

Clair3 - Symphonizing pileup and full-alignment for high-performance long-read variant calling

REF=TGTGB #326

Closed by minw2828 3 months ago

minw2828 commented 4 months ago

Hello,

I hope you're doing well. Thank you for developing this tool! 😊

There is a locus of unknown sequence in hg38 at chr3:16,902,859-16,902,898 that is displayed as TGTGN. I've highlighted it in yellow in the attached IGV and UCSC screenshots.

Figure1

Figure2

Clair3 reports the REF allele at this site as TGTGB:

gunzip -c tumor.clair3.vcf.gz | grep TGTGB
chr3    16902879    .   TGTGB   T   25.68   PASS    P   GT:GQ:DP:AD:AF  1/1:25:25:2,22:0.88

The TGTGB allele caused the following error when I tried to load the VCF file into IGV:

Error loading features for interval: chr3:16902858-16902898 htsjdk.tribble.TribbleException: The provided VCF file is malformed at approximately line number 215: unparsable vcf record with allele TGTGB

When I replaced TGTGB with TGTGN, the file loaded successfully.
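
For anyone who wants to check whether other records in a VCF are affected, a scan along these lines may help. It is a minimal sketch in Python, assuming that a plain sequence allele may contain only A, C, G, T, or N (in either case), which is my understanding of what the VCF specification and the htsjdk parser accept. Symbolic alleles such as `<DEL>`, `*`, `.`, and breakend notation are skipped. The function name `find_iupac_records` is hypothetical and is not part of Clair3, IGV, or htsjdk.

```python
import re

# Assumption: a plain sequence allele may only contain these bases.
PLAIN_ALLELE = re.compile(r"[ACGTNacgtn]+")
# Alleles made up only of letters are treated as sequence alleles;
# symbolic (<DEL>), '*', '.', and breakend alleles never match this.
SEQUENCE_LIKE = re.compile(r"[A-Za-z]+")


def find_iupac_records(lines):
    """Yield (chrom, pos, allele) for sequence alleles with non-ACGTN bases."""
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        # ALT may hold several comma-separated alleles.
        for allele in [ref] + alt.split(","):
            if SEQUENCE_LIKE.fullmatch(allele) and not PLAIN_ALLELE.fullmatch(allele):
                yield chrom, int(pos), allele
```

The function takes any iterable of text lines, so a compressed file can be passed as an open handle from `gzip.open("tumor.clair3.vcf.gz", "rt")`.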

I just wanted to bring this to your attention.

Many thanks, Min

aquaskyline commented 4 months ago

Which Clair3 version were you using?

minw2828 commented 4 months ago

This one: hkubal/clair3:latest sha256:89e1ad86d5982a870f423ae29499a8b6eb1551eb88c42834e691facb8c5797fc

aquaskyline commented 4 months ago

Unless you have enabled --keep_iupac_bases, all non-ACGT bases are converted to 'N'. This topic was discussed in #153.
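
The conversion described above can be illustrated roughly as follows. This is a sketch of the idea only, not Clair3's actual code: `mask_non_acgt` is a hypothetical helper, and returning upper-case 'N' for every replaced base is an assumption made for the example.

```python
import re

# Anything other than A, C, G, or T (either case) gets replaced.
NON_ACGT = re.compile(r"[^ACGTacgt]")


def mask_non_acgt(sequence):
    """Replace every non-ACGT base in a sequence with 'N'."""
    return NON_ACGT.sub("N", sequence)


print(mask_non_acgt("TGTGB"))  # TGTGN
```

Applied to the REF allele from this issue, the IUPAC code B (any base except A) becomes N, which gives the TGTGN form that IGV loaded without error.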

minw2828 commented 3 months ago

Thank you @aquaskyline! 👍