I am using the complete wf-human-variation pipeline (previously I used dorado 0.6.0) with the --sv --snp --mod --phased --cnv options. In sample A1 I know there is a 405 kb deletion (Chr6:162154938-162560714) and a 509 kb duplication (Chr6:162185434-162695422) on two different alleles. Note that these two SVs overlap substantially. After adding the Sniffles2 options --long-del-length 520000 and --long-dup-length 520000 to the nextflow config, both SVs are detected, but they are assigned to the same allele (from a different experiment I know that he inherited the deletion from the father and the duplication from the mother).
Sample A1:
chr6 162154941 Sniffles2.DEL.1CDFS5 N DEL 60 PASS PRECISE;SVTYPE=DEL;SVLEN=-405774;END=162560715;SUPPORT=13;COVERAGE=27,21,30,29,37;PHASE=1,162561086,13,7,PASS,FAIL GT:GQ:DR:DV:PS 1|0:60:14:13:160822003
chr6 162185435 Sniffles2.DUP.1E81S5 N DUP 60 PASS PRECISE;SVTYPE=DUP;SVLEN=509992;END=162695427;SUPPORT=9;COVERAGE=12,12,19,35,35;PHASE=NULL,NULL,5,5,FAIL,FAIL GT:GQ:DR:DV:PS 1|0:28:22:9:160822003
Since the two SVs are in two different alleles, I was expecting 1|0 for one and 0|1 for the other.
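To make this expectation concrete, here is a minimal Python sketch that compares the GT and PS sample fields of the two A1 records above. It assumes whitespace-separated columns (as pasted here; a real VCF is tab-separated) and a single sample column. The helper names `sample_fields`, `alt_haplotypes` and `phase_relation` are hypothetical, not part of Sniffles2 or the workflow.

```python
# The two A1 records, copied from the post (columns separated by whitespace).
DEL_A1 = (
    "chr6 162154941 Sniffles2.DEL.1CDFS5 N DEL 60 PASS "
    "PRECISE;SVTYPE=DEL;SVLEN=-405774;END=162560715;SUPPORT=13;"
    "COVERAGE=27,21,30,29,37;PHASE=1,162561086,13,7,PASS,FAIL "
    "GT:GQ:DR:DV:PS 1|0:60:14:13:160822003"
)
DUP_A1 = (
    "chr6 162185435 Sniffles2.DUP.1E81S5 N DUP 60 PASS "
    "PRECISE;SVTYPE=DUP;SVLEN=509992;END=162695427;SUPPORT=9;"
    "COVERAGE=12,12,19,35,35;PHASE=NULL,NULL,5,5,FAIL,FAIL "
    "GT:GQ:DR:DV:PS 1|0:28:22:9:160822003"
)


def sample_fields(vcf_line):
    """Map the FORMAT keys to the values of the first sample column."""
    cols = vcf_line.split()
    return dict(zip(cols[8].split(":"), cols[9].split(":")))


def alt_haplotypes(gt):
    """Return the haplotype indices (0 or 1) carrying the ALT allele in a phased GT."""
    return {i for i, allele in enumerate(gt.split("|")) if allele == "1"}


def phase_relation(line_a, line_b):
    """Classify two records as cis/trans, if both are phased in the same phase set."""
    a, b = sample_fields(line_a), sample_fields(line_b)
    if "|" not in a["GT"] or "|" not in b["GT"]:
        return "unphased"
    if a["PS"] != b["PS"]:
        return "different phase sets"
    haps_a, haps_b = alt_haplotypes(a["GT"]), alt_haplotypes(b["GT"])
    if haps_a == haps_b:
        return "cis"
    if haps_a.isdisjoint(haps_b):
        return "trans"
    return "partially shared"


print(phase_relation(DEL_A1, DUP_A1))  # prints "cis"
```

Both records share PS 160822003 and carry GT 1|0, so the sketch reports them as cis, whereas the known inheritance (del from the father, dup from the mother) would correspond to trans.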
In addition, I am wondering how phasing is assigned to the duplication in A1, where PHASE=NULL,NULL,5,5,FAIL,FAIL.
Moreover, in sample H1 (healthy, has only the deletion) the deletion is detected but not phased:
chr6 162154941 Sniffles2.DEL.1403S5 N DEL 60 PASS PRECISE;SVTYPE=DEL;SVLEN=-405774;END=162560715;SUPPORT=9;PHASE=NULL,NULL,3,3,FAIL,FAIL GT:GQ:DR:DV:PS 1/1:25:0:9:.
In sample A2 (where I expect the same del/dup found in A1) it detects:
chr6 162154943 Sniffles2.DEL.20EBS5 N DEL 60 PASS PRECISE;SVTYPE=DEL;SVLEN=-405774;END=162560717;SUPPORT=5;PHASE=1,NULL,3,2,PASS,FAIL GT:GQ:DR:DV:PS 1|0:24:9:5:161792994
chr6 162185435 Sniffles2.DUP.22A5S5 N DUP 60 PASS PRECISE;SVTYPE=DUP;SVLEN=509992;END=162695427;SUPPORT=10;PHASE=2,162429564,7,7,PASS,PASS GT:GQ:DR:DV:PS 0/1:60:12:10:.
The tool phases the deletion in sample A2 (GT 1|0) even though its PHASE field contains NULL and FAIL values, whereas it does not phase the duplication in A2 (GT 0/1) even though its PHASE field contains only non-NULL values with PASS filters. Why does this happen?
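To make the mismatch easier to see, the sketch below splits the PHASE INFO entry of the two A2 records into its six comma-separated values. The subfield names in `PHASE_KEYS` are an assumption (haplotype, phase set, their read support, and their filters, in that order); please check the `##INFO=<ID=PHASE,...>` line in your own VCF header before relying on them. `phase_info` is a hypothetical helper, and columns are assumed to be whitespace-separated as pasted here.

```python
# Assumed order of the PHASE subfields; verify against the VCF header.
PHASE_KEYS = ("hp", "ps", "hp_support", "ps_support", "hp_filter", "ps_filter")

# The two A2 records, copied from the post.
DEL_A2 = (
    "chr6 162154943 Sniffles2.DEL.20EBS5 N DEL 60 PASS "
    "PRECISE;SVTYPE=DEL;SVLEN=-405774;END=162560717;SUPPORT=5;"
    "PHASE=1,NULL,3,2,PASS,FAIL GT:GQ:DR:DV:PS 1|0:24:9:5:161792994"
)
DUP_A2 = (
    "chr6 162185435 Sniffles2.DUP.22A5S5 N DUP 60 PASS "
    "PRECISE;SVTYPE=DUP;SVLEN=509992;END=162695427;SUPPORT=10;"
    "PHASE=2,162429564,7,7,PASS,PASS GT:GQ:DR:DV:PS 0/1:60:12:10:."
)


def phase_info(vcf_line):
    """Return the PHASE INFO entry as a dict, or None if the record has none."""
    info = vcf_line.split()[7]  # INFO is the 8th VCF column
    for entry in info.split(";"):
        if entry.startswith("PHASE="):
            values = entry[len("PHASE="):].split(",")
            return dict(zip(PHASE_KEYS, values))
    return None


for name, record in (("DEL", DEL_A2), ("DUP", DUP_A2)):
    gt = record.split()[9].split(":")[0]  # GT is the first sample field
    print(name, gt, phase_info(record))
```

Under these assumed names, the A2 deletion has a NULL phase set in PHASE yet a phased GT with PS 161792994, while the duplication has PASS for both filters yet an unphased GT, which is the inconsistency described above.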
I have performed phasing with both longphase and whatshap, with the same results.
Below are IGV screenshots (colored by HP) of the start and end coordinates of the del and dup for samples A1/A2 (both del and dup) and sample H1 (del only).
Hello @agatafant, if you are using a version of the workflow >=2.2.0, then the phasing is done internally by Sniffles2, so this question is probably better asked of its developers directly.