Open Sithara85 opened 6 years ago
Including my response from the other thread here
We don't have plans to add a quality filter to the SpeedSeq SV pipeline, but you can do it using awk:
awk '$0~"^#" || $6>0 { print }' my.vcf
Hi Sithara,
The QUAL score is meaningful for variants that have been genotyped with SVTyper (which can be done as an option through SpeedSeq). We have used varying thresholds for QUAL in publications (≥100 in the SpeedSeq paper and ≥20 in the GTEx paper), and it is generally appropriate to filter low-quality variants from the VCF file. In general, low-quality SVs reflect poor alignments to the reference genome. These may be interpreted by LUMPY as non-reference events (which is why they are included in the VCF), but more sensitive inspection by SVTyper finds no convincing evidence of a true SV.
Hope that helps, Colby
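For anyone who wants to apply one of the thresholds mentioned above (≥20 or ≥100) rather than just dropping QUAL = 0, here is a minimal sketch of the same awk idea wrapped in a shell function. The function name filter_vcf_by_qual is hypothetical and not part of SpeedSeq. It assumes an uncompressed, tab-delimited VCF with QUAL in column 6, and it treats a missing QUAL (".") as failing the filter, which is an assumption you may want to change.

```shell
# Hypothetical helper (not part of SpeedSeq): keep VCF header lines plus
# records whose QUAL (column 6) is at least the given threshold.
filter_vcf_by_qual() {
    min_qual="$1"   # threshold, e.g. 20 or 100
    vcf="$2"        # path to an uncompressed VCF file
    awk -F '\t' -v min="$min_qual" '
        /^#/ { print; next }                            # always keep header lines
        $6 != "." && ($6 + 0) >= (min + 0) { print }    # numeric compare; drop missing QUAL
    ' "$vcf"
}
```

Usage would look like `filter_vcf_by_qual 20 my.vcf > my.qual20.vcf`. Which threshold suits a given dataset is a judgment call, so comparing call counts at a few cutoffs before settling on one is probably worthwhile.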
Hello, how should I choose an appropriate QUAL threshold? My data has a sequencing depth of 10x. @cc2qe
Hi Colby,
We are using SpeedSeq 0.1.2 to process whole genomes, for example NA12877. We are getting many calls with QUAL = 0. Can we just filter them from the VCF file after processing through the SpeedSeq workflow? What does a QUAL score of zero or near zero mean?
I have seen that for speedseq var/somatic calls you have an option to filter the output by QUAL score using -q. Is it applicable in the SpeedSeq SV pipeline?
Thank you, Sithara