hall-lab / svtyper

Bayesian genotyper for structural variants
MIT License

What's the meaning of QUAL = 0? Is there an option to add -q to the SpeedSeq SV pipeline? #89

Open Sithara85 opened 6 years ago

Sithara85 commented 6 years ago

Hi Colby,

We are using SpeedSeq 0.1.2 to process whole genomes, for example NA12877. We are getting many calls with QUAL = 0. Can we just filter them from the VCF file after running the SpeedSeq workflow? What does a QUAL score of zero or near zero mean?

I have seen that the speedseq var and somatic calls have an option to filter the output by QUAL score using -q. Is it applicable to the SpeedSeq SV pipeline?

Thank you, Sithara

cc2qe commented 6 years ago

Including my response from the other thread here:

We don't have plans to add a quality filter to the SpeedSeq SV pipeline, but you can do it using awk: cat my.vcf | awk '$0~"^#" || $6>0 { print }'

Hi Sithara,

The QUAL score is meaningful for variants that have been genotyped with SVTyper (which can be done as an option through SpeedSeq). We have used varying thresholds for QUAL in publications (≥100 in the SpeedSeq paper and ≥20 in the GTEx paper), and it is generally appropriate to filter low-quality variants from the VCF file. In general, low-quality SVs reflect poor alignments to the reference genome. These may be interpreted by LUMPY as non-reference events (which is why they are included in the VCF), but more sensitive inspection by SVTyper finds no convincing evidence of a true SV.
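For anyone who would rather script the filter than use the awk one-liner, here is a minimal Python sketch of the same idea with a configurable cutoff. The function name `filter_vcf_by_qual` and the `min_qual` parameter are made up for illustration and are not part of SpeedSeq or SVTyper. It assumes an uncompressed, tab-delimited VCF, and it drops records whose QUAL is missing (`.`), which is a choice made here rather than anything the tools prescribe. It keeps records with QUAL greater than or equal to the cutoff, to match the "≥" thresholds mentioned above, whereas the awk command keeps QUAL strictly greater than 0.

```python
def filter_vcf_by_qual(lines, min_qual=20.0):
    """Yield VCF header lines plus records whose QUAL (6th column) >= min_qual.

    Hypothetical helper for illustration; not part of SpeedSeq or SVTyper.
    """
    for line in lines:
        # Header and meta-information lines start with '#': always keep them.
        if line.startswith("#"):
            yield line
            continue

        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # malformed record: skip it

        try:
            qual = float(fields[5])
        except ValueError:
            continue  # QUAL is '.' or otherwise non-numeric (assumption: drop)

        if qual >= min_qual:
            yield line
```

One way to use it would be to open the VCF, pass the file object in, and write the result out, for example `sys.stdout.writelines(filter_vcf_by_qual(open("my.vcf"), min_qual=20))`. The default of 20 is just the GTEx value quoted above and is not a recommendation for any particular dataset.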

Hope that helps, Colby

SunWinner01 commented 1 month ago

Hello, I would like to ask how to choose an appropriate QUAL threshold. My data has a sequencing depth of 10x. @cc2qe