Closed by wdecoster 4 years ago
Hi Wouter,
thanks for reporting this issue. I have observed a similar issue when SVIM erroneously outputs a VCF record with POS=0 (although the VCF spec requires POS to be greater than 0). For some reason, bcftools replaces these wrong POS fields with values of 2^32. I have already fixed the underlying issue causing POS=0 in the output VCF with the following commit: 3c8915a8d731df2370fbfbd5242c02576fbb118e.
Can you confirm that the original VCF output from SVIM contains POS fields with 0 instead of 2^32? If this is the case, could you please reprocess your sample with the current master of SVIM instead of v1.4.1? If this fixes the issue, I can upload the current master as v1.4.2 to pypi and bioconda.
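One way to run that check without bcftools is a small script that scans the raw VCF for POS values below 1 or at 2^32 and above. This is only a rough sketch, not part of SVIM or bcftools: the function name find_suspicious_pos is made up for illustration, and the svim.DEL.1 record in the example is hypothetical.

```python
POS_OVERFLOW = 2 ** 32  # 4294967296, the value seen in the sorted output


def find_suspicious_pos(lines):
    """Yield (line_number, chrom, pos, record_id) for records with POS < 1 or POS >= 2^32."""
    for line_number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a usable record line
        chrom, pos, record_id = fields[0], int(fields[1]), fields[2]
        if pos < 1 or pos >= POS_OVERFLOW:
            yield line_number, chrom, pos, record_id


# In-memory example; the chrEBV record mirrors the thread, svim.DEL.1 is hypothetical.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "chrEBV\t0\tsvim.DUP_TANDEM.6676\tN\t<DUP:TANDEM>\n",
    "chr1\t1000\tsvim.DEL.1\tN\t<DEL>\n",
]
for hit in find_suspicious_pos(example):
    print(hit)  # (3, 'chrEBV', 0, 'svim.DUP_TANDEM.6676')
```

For a real file, something like `with open("variants.vcf") as handle: hits = list(find_suspicious_pos(handle))` should work for an uncompressed VCF.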
Cheers, David
Hi David,
I can confirm this happens with bcftools sort but not with bcftools view:
diff -y --suppress-common-lines <(cat variants.vcf) <(bcftools view variants.vcf) | grep 4294967296 # returns nothing
diff -y --suppress-common-lines <(cat variants.vcf) <(bcftools sort variants.vcf) | grep 4294967296
Writing to /tmp/bcftools-sort.IChFRc
Merging 1 temporary files
Cleaning
Done
chr5_GL000208v1_random 0 svim.BND.130634 N ]chr5 | chr5_GL000208v1_random 4294967296 svim.BND.130634 N
chr17_KI270729v1_random 0 svim.BND.311465 N [chrX | chr17_KI270729v1_random 4294967296 svim.BND.311465 N
chr22_KI270736v1_random 0 svim.BND.346285 N ]chr2 | chr22_KI270736v1_random 4294967296 svim.BND.346285 N
chrEBV 0 svim.DUP_TANDEM.6676 N <DUP:TANDEM> | chrEBV 4294967296 svim.DUP_TANDEM.6676 N <DUP:
chrUn_GL000226v1 0 svim.DUP_TANDEM.6818 N | chrUn_GL000226v1 4294967296 svim.DUP_TANDEM.6818
chrUn_KI270435v1 0 svim.BND.347865 N ]chr1 | chrUn_KI270435v1 4294967296 svim.BND.347866 N
chrUn_KI270435v1 0 svim.BND.347866 N [chrY | chrUn_KI270435v1 4294967296 svim.BND.347865 N
chrUn_KI270590v1 0 svim.DUP_TANDEM.6971 N | chrUn_KI270590v1 4294967296 svim.DUP_TANDEM.6971
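The same comparison can be done per record ID instead of line by line, which avoids the truncated side-by-side columns. This is a minimal sketch under the assumption that record IDs are unique within the file; the helper names pos_by_id and changed_positions are made up for illustration, and the example lines keep only the first three VCF columns.

```python
def pos_by_id(lines):
    """Map each record ID to its (CHROM, POS) pair, skipping header lines."""
    positions = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, record_id = line.rstrip("\n").split("\t")[:3]
        positions[record_id] = (chrom, int(pos))
    return positions


def changed_positions(original_lines, processed_lines):
    """Return (ID, before, after) for records whose CHROM/POS differs between the two inputs."""
    before = pos_by_id(original_lines)
    after = pos_by_id(processed_lines)
    return sorted(
        (record_id, before[record_id], after[record_id])
        for record_id in before.keys() & after.keys()
        if before[record_id] != after[record_id]
    )


# Records cut down to CHROM, POS and ID; values taken from the diff output above.
original = ["chr5_GL000208v1_random\t0\tsvim.BND.130634\n"]
processed = ["chr5_GL000208v1_random\t4294967296\tsvim.BND.130634\n"]
for change in changed_positions(original, processed):
    print(change)
```

With real data, the second input would be the captured output of bcftools sort rather than a hand-written list.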
I'll install it from git and report back to you.
Cheers, Wouter
The version on GitHub seems to be okay!
Thanks a lot, Wouter, for checking, and sorry for the hassle. I just released SVIM v1.4.2, so this bug should also be fixed on bioconda soon.
Cheers, David
I've merged the changes, thanks again! https://github.com/bioconda/bioconda-recipes/pull/24743
Hi,
I'm using SVIM v1.4.1, and I notice that for some of the random and unplaced contigs (e.g. chr5_GL000208v1_random, chrUn_GL000226v1) I get surprisingly high coordinates, suspiciously always 4294967296, i.e. 2^32. Building a normal tbi tabix index breaks for variants like that.
Below are some examples, I can share the full VCF for this sample if you want:
I checked the length of chr5_GL000208v1_random and that's only 92kb. So something is off here :)
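That length check can be automated by comparing every POS against the contig lengths declared in the VCF header. The sketch below assumes the header carries ##contig lines with ID and length fields; the function names are made up for illustration, and the length of 92000 is a placeholder for the roughly 92kb mentioned above, not the exact contig length.

```python
import re


def contig_lengths(lines):
    """Collect contig lengths from ##contig=<ID=...,length=...> header lines."""
    lengths = {}
    for line in lines:
        if not line.startswith("##contig=<"):
            continue
        contig_id = re.search(r"[<,]ID=([^,>]+)", line)
        length = re.search(r"[<,]length=(\d+)", line)
        if contig_id and length:
            lengths[contig_id.group(1)] = int(length.group(1))
    return lengths


def out_of_range_records(lines):
    """Return (chrom, pos, record_id, contig_length) for POS outside 1..contig_length."""
    lines = list(lines)
    lengths = contig_lengths(lines)
    bad = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, record_id = line.rstrip("\n").split("\t")[:3]
        limit = lengths.get(chrom)  # contigs missing from the header are skipped
        if limit is not None and not (1 <= int(pos) <= limit):
            bad.append((chrom, int(pos), record_id, limit))
    return bad


# Placeholder length: the thread only says the contig is about 92kb long.
example_vcf = [
    "##contig=<ID=chr5_GL000208v1_random,length=92000>\n",
    "#CHROM\tPOS\tID\n",
    "chr5_GL000208v1_random\t4294967296\tsvim.BND.130634\n",
]
print(out_of_range_records(example_vcf))
```

Records on contigs that the header does not declare are silently skipped here, so an empty result only means that nothing could be shown to be out of range.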
Cheers, Wouter