When I merge VCFs generated using --refcall POSITIONAL and -P 2, I get triallelic genotype calls (2|0, 0|2) in the resulting merged VCF file that are not present in the individual VCF files. I think it has to do with how bcftools treats "*" in the ALT column. Is there a simple way to change the * to . so that bcftools identifies the ALT as a reference call and not an alternative allele?
For example, in the single .vcf, this is the result:
MTB_anc 1143 . G A
1|0:96:21:37:1104:100:15,6:0.714,0.286:0.214,0.214:0.363:41,32:21:PASS
In the merged .vcf, this is the result for the same sample (first) and the other two that have been merged, which are both homozygous reference:
MTB_anc 1143 . G *,A
2|0:96:21:37:15,.,6:0.714,.,0.286:0.214,.,0.214:0.363:41,.,32:21:PASS:1104:100
0|0:177:59:37:59,.,.:1,.,.:0,.,.:.:41,.,.:59:PASS:.:.
0|0:75:25:37:25,.,.:1,.,.:0,.,.:.:41,.,.:25:PASS:.:.
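To make the index shift concrete, here is a minimal Python sketch that maps GT indices back to allele strings for the two records above. The helper name decode_genotype is made up for illustration and is not part of octopus or bcftools; it only assumes the standard VCF convention that index 0 is REF and indices 1, 2, ... follow the order of the ALT column.

```python
def decode_genotype(ref, alt, gt):
    """Translate VCF GT allele indices into allele strings.

    Index 0 is REF; indices 1..n follow the comma-separated ALT column.
    Hypothetical helper for illustration only.
    """
    alleles = [ref] + alt.split(",")
    sep = "|" if "|" in gt else "/"
    return [alleles[int(i)] if i != "." else "." for i in gt.split(sep)]


# Single-sample record: ALT is "A", so index 1 means A.
print(decode_genotype("G", "A", "1|0"))    # ['A', 'G']

# Merged record: ALT is "*,A", so A has moved to index 2.
print(decode_genotype("G", "*,A", "2|0"))  # ['A', 'G']
```

Both calls decode to the same pair of alleles, so the 2|0 in the merged file points at the same A allele as the 1|0 in the single file; the extra index comes from the * entry that now sits first in ALT.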
Attached is a screenshot of the same three samples in IGV for that position. You can see the calls should be 0|1, 0|0, 0|0
This will become a problem for me at sites that are actually triallelic. Thank you!
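A rough sketch of the kind of rewrite I have in mind, in Python. The function name star_alt_to_missing is hypothetical, and the sketch assumes that the refcall records have an ALT column containing exactly "*" and nothing else; I have not verified that this holds for every record, nor that bcftools merge then produces the genotypes I want. Records with a real alternative allele, or with "*" alongside other alleles, are left untouched so that no genotype indices need renumbering.

```python
def star_alt_to_missing(line):
    """Rewrite ALT '*' to '.' on a single VCF line.

    Assumption: a reference-call record has ALT exactly equal to '*'.
    Header lines and all other records are returned unchanged.
    """
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    # Column 5 (index 4) is ALT in a tab-delimited VCF record.
    if len(fields) > 4 and fields[4] == "*":
        fields[4] = "."
    return "\t".join(fields) + newline


if __name__ == "__main__":
    import sys

    # Usage (uncompressed VCF on stdin, rewritten VCF on stdout).
    for vcf_line in sys.stdin:
        sys.stdout.write(star_alt_to_missing(vcf_line))
```

This would be run on each single-sample VCF before merging, for example by piping an uncompressed VCF through the script and then compressing and indexing the output again.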
Version
$ octopus --version
octopus version 0.7.4
Target: x86_64 Linux 5.10.25-linuxkit
SIMD extension: AVX2
Compiler: GNU 11.1.0
Boost: 1_76
Command
Command line to install octopus:
Command line to run octopus: