brentp / smoove

structural variant calling and genotyping with existing tools, but, smoothly.
Apache License 2.0

IMPRECISE Confuse #194

Closed cfz1998 closed 2 years ago

cfz1998 commented 2 years ago

Hi @brentp!

  1. My SV files have been annotated with slivar. In the annotated files, for DEL and DUP I keep only loci that have HQHET and HQHA samples, even if the locus also has LQHET samples. For other variant types, I keep only sites with MSHQ >= 3. Is this a reasonable approach?
  2. Many sites still carry the INFO flag "IMPRECISE" after filtering this way. Should these sites be retained or removed? Should a site be discarded whenever it is marked "IMPRECISE"? For example, this one:
     <chr1A_part1 141759 2120698_1 N [chr1A_part1:141900[N 233.8 . SVTYPE=BND;STRANDS=--:6;IMPRECISE;CIPOS=0,29;CIEND=-739,29;CIPOS95=0,9;CIEND95=-153,1;MATEID=2120698_2;EVENT=2120698;SU=6;PE=6;SR=0;SNAME=WATDE0978:1_1;ALG=PROD;GCF=0.514286;AN=1476;AC=3;MSHQ=4>
     Or this one:
     <chr1A_part1 3418221 205622 N 1477.1 . SVTYPE=DEL;SVLEN=-252;END=3418473;STRANDS=+-:11;IMPRECISE;CIPOS=-49,248;CIEND=-271,49;CIPOS95=-49,94;CIEND95=-121,49;SU=11;PE=11;SR=0;SNAME=Chinese_Spring:11,landmark:25;ALG=PROD;GCF=0;AN=1916;AC=1055;MSHQ=3.9746;HQHET=so many samples,HQHA=so many samples>
     The second site has no LQHET samples, yet it is still marked "IMPRECISE".

Thanks! --zcf

brentp commented 2 years ago

Hi, it's better to filter on the duphold annotations. It's OK if sites are IMPRECISE; that usually means they don't have any split reads, but they can still be real.
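
For anyone reading later, here is a minimal sketch in Python of what filtering on duphold annotations could look like. It assumes the VCF carries duphold's per-sample FORMAT fields DHFFC and DHBFC. The thresholds (DHFFC < 0.7 for deletions, DHBFC > 1.3 for duplications) are the starting points suggested in duphold's documentation, not values given in this thread, and may need tuning for your data. The helper names and the per-sample values in the usage lines are hypothetical.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; flags such as IMPRECISE map to True."""
    out = {}
    for item in info.split(";"):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def sample_fields(fmt, sample):
    """Pair the FORMAT keys with one sample's colon-separated values."""
    return dict(zip(fmt.split(":"), sample.split(":")))


def passes_duphold(svtype, fields, max_dhffc=0.7, min_dhbfc=1.3):
    """Depth-based check for one sample (thresholds are assumed defaults).

    DEL: flanking fold-change (DHFFC) should drop below max_dhffc.
    DUP: GC-bin fold-change (DHBFC) should rise above min_dhbfc.
    Other SV types (e.g. BND) are passed through unchanged.
    """
    try:
        if svtype == "DEL":
            return float(fields["DHFFC"]) < max_dhffc
        if svtype == "DUP":
            return float(fields["DHBFC"]) > min_dhbfc
    except (KeyError, ValueError):
        # Annotation missing or not numeric (e.g. "."): treat as not supported.
        return False
    return True


# INFO subset taken from the DEL record quoted above.
info = parse_info("SVTYPE=DEL;SVLEN=-252;END=3418473;IMPRECISE;SU=11;PE=11;SR=0")

# IMPRECISE together with SR=0: the call rests on paired-end evidence only.
imprecise_without_split_reads = info.get("IMPRECISE") is True and int(info["SR"]) == 0

# Hypothetical FORMAT/sample columns; real values come from your own VCF.
carrier = sample_fields("GT:DHFFC:DHBFC", "0/1:0.48:0.51")
keep = passes_duphold(info["SVTYPE"], carrier)
```

With this kind of check, the IMPRECISE flag is not used as a filter at all: a deletion with no split reads is still kept when the read depth inside the event supports it.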

cfz1998 commented 2 years ago

Thank you, brentp. It's cool!