Closed MikeWLloyd closed 8 months ago
Hi, I don't recall the exact reason, but I have this as a guard in the annotate code:
if (':' in alt and ('[' in alt or ']' in alt)) or alt == "<INV>": return
I don't think that's causing the problem since it should pass still. It also has:
if len(snps.starts) == 0: return
but that should also only occur when there are no snps on the entire chromosome of the SV. can you verify that your manta calls (which seem to have no "chr" prefix) match the SNP calls which should also have no "chr" prefix?
@brentp thanks for your quick reply.
Here is a sample of calls out of the VCF I am using:
...
1 3184610 . C G 547.64 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=-0.916;DP=29;ExcessHet=0.0000;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=19.56;ReadPosRankSum=-0.636;SOR=0.646 GT:AD:DP:GQ:PL 0/1:11,17:28:99:555,0,336
1 3185681 . T C 424.64 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=-0.964;DP=31;ExcessHet=0.0000;FS=5.384;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=13.70;ReadPosRankSum=-0.814;SOR=0.690 GT:AD:DP:GQ:PL 0/1:17,14:31:99:432,0,553
1 3186268 . G A 385.64 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=0.860;DP=35;ExcessHet=0.0000;FS=1.414;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=11.69;ReadPosRankSum=-1.327;SOR=1.179 GT:AD:DP:GQ:PL 0/1:20,13:33:99:393,0,656
1 3186863 . C G 423.64 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=0.255;DP=35;ExcessHet=0.0000;FS=1.419;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=13.24;ReadPosRankSum=-0.190;SOR=0.368 GT:AD:DP:GQ:PL 0/1:18,14:32:99:431,0,578
1 3187588 . T C 739.64 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=0.423;DP=36;ExcessHet=0.0000;FS=13.720;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=20.55;ReadPosRankSum=0.095;SOR=1.886 GT:AD:DP:GQ:PL 0/1:17,19:36:99:747,0,657
...
Given that I use the same VCF for all caller annotations, and it works I am not sure that is the culprit.
Here is the Duphold output:
[duphold] finished
I, [2024-02-13T13:34:59] -- duphold: starting to read snps for chrom: 1
I, [2024-02-13T13:35:01] -- duphold: done reading 146555 bi-allelic snps for chrom: 1
I, [2024-02-13T13:35:10] -- duphold: starting to read snps for chrom: 10
I, [2024-02-13T13:35:11] -- duphold: done reading 96453 bi-allelic snps for chrom: 10
I, [2024-02-13T13:35:20] -- duphold: starting to read snps for chrom: 11
I, [2024-02-13T13:35:21] -- duphold: done reading 89973 bi-allelic snps for chrom: 11
I, [2024-02-13T13:35:29] -- duphold: starting to read snps for chrom: 12
I, [2024-02-13T13:35:30] -- duphold: done reading 86783 bi-allelic snps for chrom: 12
I, [2024-02-13T13:35:38] -- duphold: starting to read snps for chrom: 13
I, [2024-02-13T13:35:39] -- duphold: done reading 87591 bi-allelic snps for chrom: 13
I, [2024-02-13T13:35:48] -- duphold: starting to read snps for chrom: 14
I, [2024-02-13T13:35:49] -- duphold: done reading 88234 bi-allelic snps for chrom: 14
I, [2024-02-13T13:35:57] -- duphold: starting to read snps for chrom: 15
I, [2024-02-13T13:35:58] -- duphold: done reading 76587 bi-allelic snps for chrom: 15
I, [2024-02-13T13:36:04] -- duphold: starting to read snps for chrom: 16
I, [2024-02-13T13:36:05] -- duphold: done reading 72114 bi-allelic snps for chrom: 16
I, [2024-02-13T13:36:12] -- duphold: starting to read snps for chrom: 17
I, [2024-02-13T13:36:13] -- duphold: done reading 69169 bi-allelic snps for chrom: 17
I, [2024-02-13T13:36:19] -- duphold: starting to read snps for chrom: 18
I, [2024-02-13T13:36:20] -- duphold: done reading 66871 bi-allelic snps for chrom: 18
I, [2024-02-13T13:36:24] -- duphold: starting to read snps for chrom: 19
I, [2024-02-13T13:36:25] -- duphold: done reading 44727 bi-allelic snps for chrom: 19
I, [2024-02-13T13:36:37] -- duphold: starting to read snps for chrom: 2
I, [2024-02-13T13:36:39] -- duphold: done reading 133578 bi-allelic snps for chrom: 2
I, [2024-02-13T13:36:50] -- duphold: starting to read snps for chrom: 3
I, [2024-02-13T13:36:51] -- duphold: done reading 118319 bi-allelic snps for chrom: 3
I, [2024-02-13T13:37:02] -- duphold: starting to read snps for chrom: 4
I, [2024-02-13T13:37:04] -- duphold: done reading 112797 bi-allelic snps for chrom: 4
I, [2024-02-13T13:37:14] -- duphold: starting to read snps for chrom: 5
I, [2024-02-13T13:37:16] -- duphold: done reading 109705 bi-allelic snps for chrom: 5
I, [2024-02-13T13:37:26] -- duphold: starting to read snps for chrom: 6
I, [2024-02-13T13:37:27] -- duphold: done reading 110427 bi-allelic snps for chrom: 6
I, [2024-02-13T13:37:37] -- duphold: starting to read snps for chrom: 7
I, [2024-02-13T13:37:38] -- duphold: done reading 101950 bi-allelic snps for chrom: 7
I, [2024-02-13T13:37:47] -- duphold: starting to read snps for chrom: 8
I, [2024-02-13T13:37:48] -- duphold: done reading 94650 bi-allelic snps for chrom: 8
I, [2024-02-13T13:37:57] -- duphold: starting to read snps for chrom: 9
I, [2024-02-13T13:37:58] -- duphold: done reading 92288 bi-allelic snps for chrom: 9
So Duphold is reading SNPs. I am wondering if there is something about the format in the manta VCF of the Manta DUP calls specifically. Also, to be honest with you, I was simply curious why it isn't annotating for just the manta dup calls. Reading your prior posts cautioning using the SNP based filter and what you say in the docs, I am not inclined to use DHGT. However, perhaps someone else would do so for Manta.
Actually, it may have to do with how samples are being named in the VCF, BAM, and Manta VCF. I suspect this is a 'me' problem, rather than a code problem. I will close for now, and reopen if needed.
I am trying to add DHGT to Manta, SMOOVE/Lumpy and Delly calls along with the depth based metrics provided by Duphold. I am able to get DHGT to both Smoove/Lumpy and Delly calls. DHSP is added to Smoove, Lumpy and Manta DEL calls.
I am not seeing DHGT added to Manta DUP calls. Is there something obvious I am missing?
Thanks.
MantaDUP:
MantaDEL:
Command:
Tool version: