POS — position: The reference position, with the 1st base having position 1. ... Telomeres are indicated by using positions 0 or N+1, where N is the length of the corresponding chromosome or contig. (Integer, Required)
If a variant's POS value contains a string that doesn't encode an integer:
$ cat example.vcf
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="All filters passed">
##fileDate=20150218
##reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
##source=1000GenomesPhase3Pipeline
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=1,assembly=b37,length=51304566>
##SAMPLE=<ID=SAMPLE_1>
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE_1
1 stuff ID1 G T . PASS . GT 1/0
1 1.5 ID2 G T . PASS . GT 1/0
BCFtools silently ignores the issue and parses these invalid positions into valid ones:
$ ./bcftools view -H example.vcf
1 0 ID1 G T . PASS . GT 1/0
1 1 ID2 G T . PASS . GT 1/0
$ echo $?
0
It would be a nice to have for BCFtools to fail upon detection of invalid POS values. For example, column swapping would get detected, and not potentially result in a valid VCF full of 0positions! A valid argument against doing anything about this is GIGO. However, BCFtools certainly validates some things - and after doing an upgrade, I noticed the number of checks are increasing.
According to the VCF specification, POS must be an integer:
If a variant's
POS
value contains a string that doesn't encode an integer:BCFtools silently ignores the issue and parses these invalid positions into valid ones:
This happens because
hts_str2int
doesn't have any checks for non-int values: https://github.com/samtools/htslib/blob/008eabd/textutils_internal.h#L218 (which I think mirrorsstrtol
's behaviour).It would be a nice to have for BCFtools to fail upon detection of invalid POS values. For example, column swapping would get detected, and not potentially result in a valid VCF full of
0
positions! A valid argument against doing anything about this is GIGO. However, BCFtools certainly validates some things - and after doing an upgrade, I noticed the number of checks are increasing.