brentp / duphold

don't get DUP'ed or DEL'ed by your putative SVs.
MIT License

The "REF prefixes differ" followed by Failed to merge alleles #47

Open jjfarrell opened 2 years ago

jjfarrell commented 2 years ago

Any suggestions for this error?

 tail smoove_duphold_ont_manta.sh.o3870689
[smoove] 2022/03/31 11:14:06 [duphold] finished
[smoove] 2022/03/31 11:14:07 [duphold] finished
[smoove] 2022/03/31 11:14:07 [duphold] finished
[smoove] 2022/03/31 11:14:07 [duphold] finished
[smoove] 2022/03/31 11:14:08 [duphold] finished
[smoove] 2022/03/31 11:14:08 starting bcftools merge
[smoove] 2022/03/31 11:35:58 The REF prefixes differ: T vs N (1,1)
Failed to merge alleles at chr22:17756433 in tmp.22/tempclean-539213974/dh132899912smoove-duphold.bcf

Below are the variants at that location. Three records have N as the REF and one has T; the mismatch appears to be what breaks the merge step.

chr22   17756433        chr22:17756433:FG       N       <INS:SVSIZE=74:AGGREGATED>
chr22   17756433        chr22:17756433:FG.0     N       <INS:SVSIZE=74:BREAKPOINT1>
chr22   17756433        chr22:17756433:FG.1     N       <INS:SVSIZE=74:BREAKPOINT2>
chr22   17756433        chr22:17756433:OG       T       ]chr2:102886770]T
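
A minimal sketch of how other sites with the same problem could be located before merging. It assumes plain-text VCF records; the function name `find_ref_conflicts` is made up for illustration and is not part of duphold, smoove, or bcftools. It groups records by chromosome and position and reports any site where the first REF base differs between records.

```python
from collections import defaultdict


def find_ref_conflicts(lines):
    """Return {(chrom, pos): sorted first REF bases} for sites where records disagree.

    Hypothetical helper for illustration only. Fields are split on any
    whitespace, which is enough to read CHROM, POS and REF.
    """
    first_bases = defaultdict(set)
    for line in lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        first_bases[(chrom, pos)].add(ref[0].upper())
    # Keep only sites that carry more than one distinct first REF base.
    return {site: sorted(bases) for site, bases in first_bases.items() if len(bases) > 1}


# The four records reported above.
records = [
    "chr22\t17756433\tchr22:17756433:FG\tN\t<INS:SVSIZE=74:AGGREGATED>",
    "chr22\t17756433\tchr22:17756433:FG.0\tN\t<INS:SVSIZE=74:BREAKPOINT1>",
    "chr22\t17756433\tchr22:17756433:FG.1\tN\t<INS:SVSIZE=74:BREAKPOINT2>",
    "chr22\t17756433\tchr22:17756433:OG\tT\t]chr2:102886770]T",
]
conflicts = find_ref_conflicts(records)
```

For these four records, `conflicts` holds a single entry for chr22:17756433 with the bases N and T.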

brentp commented 2 years ago

hmm. that is odd. you could use tiwih setref to set the 'N' to 'T' and see if that helps. Are these manta calls?
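
To illustrate the idea of replacing an 'N' REF with the actual reference base, here is a rough sketch. It is not how tiwih setref is implemented and does not reproduce its command line; the function `set_ref_from_reference` and the in-memory `ref_seqs` dictionary (chromosome name to sequence string) are assumptions made for the example. It only handles a single-base REF of N and leaves every other record untouched.

```python
def set_ref_from_reference(line, ref_seqs):
    """Replace a single-base 'N' REF with the reference base at that position.

    Hypothetical helper: `ref_seqs` maps chromosome name to its sequence
    as a plain string. VCF POS is 1-based, so the base is at index pos - 1.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref = fields[0], int(fields[1]), fields[3]
    if ref.upper() == "N":
        fields[3] = ref_seqs[chrom][pos - 1].upper()
    return "\t".join(fields)


# Toy reference, not real chr22 sequence: position 4 is 'T'.
toy_reference = {"chr22": "ACGT"}
fixed = set_ref_from_reference("chr22\t4\tsv1\tN\t<INS>", toy_reference)
unchanged = set_ref_from_reference("chr22\t2\tsv2\tC\t<DEL>", toy_reference)
```

In practice the sequence would come from an indexed FASTA rather than a dictionary held in memory.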

jjfarrell commented 2 years ago

I tried it both ways (all N and all T) for just those 4 variants and it worked fine. If I make the variants a mix of 3 Ts and 1 N, it also fails. So I will try tiwih setref on the full VCF to see if that cleans up the issue. FYI, duphold does run fine if I filter out the BND sites and run it on the other variant types (INS, DEL, DUP).
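
The BND filtering workaround could look roughly like the sketch below. It assumes tab-separated plain-text VCF lines and treats any ALT containing a square bracket as a breakend, as in `]chr2:102886770]T` above. The names `is_breakend` and `drop_breakends` are hypothetical, and single-breakend alleles written with a dot are not handled.

```python
def is_breakend(alt_field):
    """True if any ALT allele uses bracketed breakend notation."""
    return any("[" in alt or "]" in alt for alt in alt_field.split(","))


def drop_breakends(lines):
    """Keep header lines and every record whose ALT is not a breakend.

    Hypothetical helper for illustration; expects tab-separated VCF lines.
    """
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 4 and is_breakend(fields[4]):
            continue
        kept.append(line)
    return kept


example = [
    "##fileformat=VCFv4.2",
    "chr22\t17756433\tchr22:17756433:FG\tN\t<INS:SVSIZE=74:AGGREGATED>",
    "chr22\t17756433\tchr22:17756433:OG\tT\t]chr2:102886770]T",
]
filtered = drop_breakends(example)
```

Here `filtered` keeps the header and the INS record and drops the breakend record.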

The input VCF is the output of graphtyper, run on sites that svimmer selected from VCFs called primarily by manta.