Closed cmguodong closed 2 years ago
Hi @cmguodong,
What you understood is not quite correct. Higher QUAL and SomaticQ values mean a higher likelihood that the variant is a true somatic variant.
The SomaticQ value of [a] (which is 60) means this variant is highly likely to be somatic: background noise has only a 1e-6 probability of generating it. Separately, tADR=50127,78 (REF has 50127 deduplicated reads, ALT has 78 deduplicated reads) shows that the variant has a low allele fraction, roughly 0.16%. The two statements do not conflict: SomaticQ measures confidence in the call, while tADR measures how abundant the ALT allele is.
We can be so confident about this mutation because 1) we have UMI (molecular barcode) information and 2) we have its matched normal, which serves as a technical control.
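As a quick sanity check of the numbers above, here is a minimal sketch of the two calculations: converting a Phred-scaled quality to an error probability, and computing the ALT allele fraction from the deduplicated read counts in tADR. The function names are made up for illustration and are not part of the variant caller.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality Q to an error probability p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def alt_allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads supporting ALT among REF + ALT reads."""
    total = ref_reads + alt_reads
    return alt_reads / total if total else 0.0


# Values taken from record [a] in this thread.
somatic_q = 60
ref_reads, alt_reads = 50127, 78  # tADR=50127,78

error_prob = phred_to_error_prob(somatic_q)
vaf = alt_allele_fraction(ref_reads, alt_reads)

print(f"error probability: {error_prob:.1e}")
print(f"ALT allele fraction: {vaf:.4%}")
```

A high SomaticQ together with a small allele fraction is therefore consistent: the first number is about confidence in the call, the second about how many reads carry the variant.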
QUAL and FILTER are defined in the VCF file format specification at https://samtools.github.io/hts-specs/VCFv4.2.pdf. In brief, QUAL is the quality of the variant, defined as the calling error probability on the Phred scale, and FILTER is a string containing information that can be used to filter out false-positive variants. SomaticQ is the somatic variant quality, which is identical to QUAL if the calling mode is tumor-normal paired (as indicated by the SOMATIC flag in the VCF INFO field).
Please note that QUAL has a different meaning depending on whether the germline-vs-somatic origin has to be determined. If the origin has to be determined, then QUAL is the same as SomaticQ. Otherwise, QUAL is the Phred-scaled error probability that the called variant is neither a germline variant nor a somatic variant.
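To compare these fields across the two records from the question, a small sketch like the following may help. It only splits the columns of a single VCF data line and the INFO key/value pairs. The helper names are hypothetical, and for real files a proper VCF parser would be a better choice.

```python
def parse_info(info: str) -> dict:
    """Parse a VCF INFO string; flags without '=' are stored as True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue  # skip the empty piece after a trailing ';'
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def summarize(record: str) -> dict:
    """Extract QUAL, FILTER, the SOMATIC flag, and SomaticQ from one VCF data line."""
    chrom, pos, _id, ref, alt, qual, flt, info = record.split()[:8]
    fields = parse_info(info)
    return {
        "qual": float(qual),
        "filter": flt,
        "somatic_flag": "SOMATIC" in fields,
        "somatic_q": float(fields["SomaticQ"]),
    }


# Records [a] and [b] as quoted in this thread (whitespace-separated here).
record_a = ("3 16306504 . C T 60 PASS SOMATIC;SomaticQ=60;TLODQ=78;NLODQ=60;NLODV=A;"
            "TNBQF=251,26,0,19;TNCQF=177,48,0,62;tDP=50250;tADR=50127,78;"
            "nDP=58463;nADR=58416,6;")
record_b = ("3 16306504 . C T 49 Q50 ANY_VAR;SomaticQ=49;TLODQ=49;NLODQ=57;NLODV=;"
            "TNBQF=0,-75,0,4;TNCQF=0,-75,0,49;tDP=50250;tADR=50127,78;nDP=0;nADR=0,0;")

a = summarize(record_a)
b = summarize(record_b)
print(a)
print(b)
```

Only record [a], called with the matched normal, carries the SOMATIC flag; record [b] was called without a normal (nDP=0) and ends up with a lower quality and a non-PASS FILTER value.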
Thank you for your detailed reply, I learned a lot from it. Thanks again ^_^
You are welcome!
Hi @cmguodong, it looks like this issue is resolved, right? If so, I will close it.
Looks like this issue is resolved. I will close this issue.
Hi Zhao,
I have some questions about the results.
How should I interpret the differing values between SRR7757440_SRR7757439_TN.vcf.gz [a] and SRR7757440_uvc1.vcf.gz [b], such as 'QUAL', 'FILTER', and especially 'SomaticQ'?
[a] 3 16306504 . C T 60 PASS SOMATIC;SomaticQ=60;TLODQ=78;NLODQ=60;NLODV=A;TNBQF=251,26,0,19;TNCQF=177,48,0,62;tDP=50250;tADR=50127,78;nDP=58463;nADR=58416,6;
[b] 3 16306504 . C T 49 Q50 ANY_VAR;SomaticQ=49;TLODQ=49;NLODQ=57;NLODV=;TNBQF=0,-75,0,4;TNCQF=0,-75,0,49;tDP=50250;tADR=50127,78;nDP=0;nADR=0,0;
The header defines SomaticQ as follows:
##INFO=<ID=SomaticQ,Number=A,Type=Float,Description="Somatic quality of the variant, the PHRED-scale probability that this variant is not somatic.">
As I understand it, the SomaticQ value of [a] means that this variant is not somatic, with a high PHRED-scale probability (60, 1 - 10e-6), but the allele counts of [a] (tADR=50127,78) suggest that this variant may be a low-frequency mutation. Am I misunderstanding this?
thanks