Illumina / strelka

Strelka2 germline and somatic small variant caller

All variants being labeled as "LowEVS" #231

Open ElizabethBorden opened 1 year ago

ElizabethBorden commented 1 year ago

Hello,

I am running Strelka2, and nearly all of the called variants are labeled "LowEVS" or "LowDepth" and show 0 or near 0 reads in the tumor sample. However, GATK Mutect2 detected thousands of variants in these same samples with no issues. The samples have around 30M reads with a mapping rate over 99%, so it is unclear why all of these locations would show such low coverage. I did look at this question from 2016 (https://github.com/Illumina/strelka/issues/7), but that problem was due to external filtering, whereas the one I am encountering comes from Strelka itself. Do you have any suggestions on what might be going wrong or what I can try to fix this? Here is an example of what the output variants look like:

8 13 65481039 . A T . LowEVS;LowDepth SOMATIC;QSS=7;TQSS=1;NT=ref;QSS_NT=7;TQSS_NT=1;SGT=AA->AA;DP=93;MQ=16.83;MQ0=47;ReadPosRankSum=0.00;SNVSB=0.00;SomaticEVS=0.43 DP:FDP:SDP:SUBDP:AU:CU:GU:TU 35:0:0:0:29,71:0,0:0,0:6,19 0:0:0:0:0,3:0,0:0,0:0,0
6599 13 65481167 . G T . LowEVS;LowDepth SOMATIC;QSS=41;TQSS=2;NT=ref;QSS_NT=41;TQSS_NT=2;SGT=GG->GT;DP=99;MQ=23.30;MQ0=29;ReadPosRankSum=0.00;SNVSB=0.00;SomaticEVS=1.48 DP:FDP:SDP:SUBDP:AU:CU:GU:TU 57:0:0:0:0,0:0,0:50,79:7,9 0:0:0:0:0,0:0,0:0,9:0,2
6600 13 65510327 . T G . LowEVS;LowDepth SOMATIC;QSS=1;TQSS=1;NT=ref;QSS_NT=1;TQSS_NT=1;SGT=TT->TT;DP=41;MQ=40.24;MQ0=3;ReadPosRankSum=0.00;SNVSB=0.00;SomaticEVS=0.79 DP:FDP:SDP:SUBDP:AU:CU:GU:TU 31:0:0:0:0,0:0,0:2,4:29,31 0:0:0:0:0,0:0,0:0,4:0,2

Thank you so much!

-Elizabeth
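
For anyone trying to read records like the ones in the post above, here is a minimal Python sketch that unpacks one of them into its FILTER values, SomaticEVS score, and per-sample fields. It is not part of Strelka2; `parse_somatic_snv` is a hypothetical helper name. It assumes the leading row index has been dropped, that the two sample columns are in the order NORMAL then TUMOR, and that each pair in AU/CU/GU/TU holds the tier 1 and tier 2 counts, which is how I understand the Strelka2 user guide. A real VCF is tab-separated; splitting on any whitespace works here only because no field contains a space.

```python
def parse_somatic_snv(line):
    """Unpack one Strelka2 somatic SNV record (hypothetical helper)."""
    cols = line.split()
    chrom, pos, _id, ref, alt, _qual, filt, info, fmt, normal, tumor = cols[:11]

    # INFO is a ';'-separated list of KEY=VALUE pairs and bare flags.
    info_map = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        info_map[key] = value

    # FORMAT names the ':'-separated fields in each sample column.
    keys = fmt.split(":")

    def sample(col):
        return dict(zip(keys, col.split(":")))

    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "filters": filt.split(";"),
        "somatic_evs": float(info_map["SomaticEVS"]),
        "normal": sample(normal),  # assumed column order: NORMAL first
        "tumor": sample(tumor),    # then TUMOR
    }


# Second record from the post, with the leading row index removed.
record = (
    "13 65481167 . G T . LowEVS;LowDepth "
    "SOMATIC;QSS=41;TQSS=2;NT=ref;QSS_NT=41;TQSS_NT=2;SGT=GG->GT;DP=99;"
    "MQ=23.30;MQ0=29;ReadPosRankSum=0.00;SNVSB=0.00;SomaticEVS=1.48 "
    "DP:FDP:SDP:SUBDP:AU:CU:GU:TU "
    "57:0:0:0:0,0:0,0:50,79:7,9 0:0:0:0:0,0:0,0:0,9:0,2"
)

parsed = parse_somatic_snv(record)
# Assumed meaning of the pair: tier 1 count, tier 2 count for the ALT base T.
tier1, tier2 = (int(x) for x in parsed["tumor"]["TU"].split(","))
print(parsed["filters"], parsed["somatic_evs"], parsed["tumor"]["DP"], tier1, tier2)
```

Under those assumptions, the tumor column of this record has a DP of 0 and no tier 1 support for the T allele, which matches the "0 or near 0 reads in the tumor sample" described in the post.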

lauratwomey commented 4 months ago

Hi Elizabeth - I had the same question! Reading a bit more about the Strelka2 methods (see Supplementary Note 2 of their paper), it seems that it has two hard-coded, built-in quality filters: LowEVS and LowDepth. If a variant does not meet the thresholds, it is marked as LowEVS, LowDepth, or both. You can read more about how the EVS is calculated in Supplementary Note 2, but basically, if the score is lower than the threshold, the variant does not pass QC (the thresholds are 6 and 7 for indels and SNVs, respectively: https://github.com/Illumina/strelka/issues/79). So I would keep only variants marked as PASS, but of course it depends on what you are doing (and you might need to readjust the thresholds?). Hope this helps, but maybe someone from the team can confirm this :)

Strelka2 supplementary material: https://static-content.springer.com/esm/art%3A10.1038%2Fs41592-018-0051-x/MediaObjects/41592_2018_51_MOESM1_ESM.pdf
This note also helped (page 29, about Strelka2): https://libstore.ugent.be/fulltxt/RUG01/002/785/131/RUG01-002785131_2019_0001_AC.pdf
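
The suggestion above to keep only variants marked as PASS can be sketched in a few lines of Python. This is a minimal illustration, not anything shipped with Strelka2: `keep_pass` is a hypothetical helper, and the demo lines are toy records (the PASS record is made up; the LowEVS;LowDepth one reuses the position and score from the original post). It assumes a standard tab-separated VCF where FILTER is the seventh column.

```python
def keep_pass(lines):
    """Yield header lines and records whose FILTER column is exactly PASS."""
    for line in lines:
        if line.startswith("#"):
            # Keep meta-information and the column header untouched.
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) > 6 and cols[6] == "PASS":
            yield line


# Toy input for illustration only.
demo = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "13\t100\t.\tA\tT\t.\tPASS\tSOMATIC;SomaticEVS=12.50\n",
    "13\t65481167\t.\tG\tT\t.\tLowEVS;LowDepth\tSOMATIC;SomaticEVS=1.48\n",
]

kept = list(keep_pass(demo))
for line in kept:
    print(line, end="")
```

For a real file, the same function can be applied to an open file handle. If bcftools is installed, `bcftools view -f PASS` should give an equivalent result without custom code.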