samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here: http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

Segfault when vcf annotate uses FORMAT/GT as a column #1477

Open cvaske opened 3 years ago

cvaske commented 3 years ago

When running a command like

bcftools annotate -o annotated.bcf -s sample1 -a annotation.tsv.gz -c CHROM,POS,FORMAT/GT source.vcf.gz

bcftools will segfault. To reproduce, use these inputs, where source.vcf.gz consists of

##fileDate=20210429
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=c1>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  sample1
c1  123 .   G   T   .   .   .   GT  .

and annotation.tsv.gz is the BGZipped and tabix-indexed version of

c1  123 0/1
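For anyone trying to reproduce this, here is a minimal sketch of how the two inputs could be built. The plain-text file names `source.vcf` and `annotation.tsv` are my own choice, and the tabix column settings (`-s 1 -b 2 -e 2`) are an assumption based on the CHROM,POS layout of the annotation file. `printf` is used so the fields are separated by real tabs, which the listings above may not preserve. The compression and indexing steps only run if `bgzip` and `tabix` are on the PATH.

```shell
#!/bin/sh
set -eu

# Write the VCF with explicit tab separators (same content as the listing above).
{
  printf '##fileDate=20210429\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '##contig=<ID=c1>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n'
  printf 'c1\t123\t.\tG\tT\t.\t.\t.\tGT\t.\n'
} > source.vcf

# Write the three-column annotation file: CHROM, POS, genotype.
printf 'c1\t123\t0/1\n' > annotation.tsv

# Compress and index only when the htslib tools are available.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
  bgzip -c source.vcf > source.vcf.gz
  tabix -f -p vcf source.vcf.gz
  bgzip -c annotation.tsv > annotation.tsv.gz
  # Assumed layout: sequence name in column 1, position in column 2.
  tabix -f -s 1 -b 2 -e 2 annotation.tsv.gz
fi
```

With these files in place, the `bcftools annotate` command at the top of this report can be run as written.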

The segfault appears to happen at line 1189 of vcfannotate.c (https://github.com/samtools/bcftools/blob/a865a16944317e0fb310eb59b82b274ad945e868/vcfannotate.c#L1189), where index 1 is used instead of 0 on the file readers array even though the array holds only a single item.

Changing that index fixes the segfault, but the genotype is still not set in the output; that may be a separate issue.

pd3 commented 3 years ago

Ah, this is because the program does not support transfer of GTs from a text file into a VCF, sorry! I added a check for this to exit gracefully with an informative error message. For now, can you create a VCF and use that to transfer the annotations from?
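A rough sketch of that workaround, for reference. The file name `annotation.vcf` and the `##fileformat` line are illustrative additions, and I am assuming the annotation VCF uses the same sample name (`sample1`) and the same REF/ALT alleles as the source record, so that the records and samples can be matched. To my understanding `-c FORMAT/GT` is accepted when the annotation source is a VCF, but treat the exact invocation as untested. The bcftools steps only run if `bcftools` is installed and `source.vcf.gz` exists in the working directory.

```shell
#!/bin/sh
set -eu

# Express the annotation (c1:123, genotype 0/1) as a VCF record instead of a TSV row.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '##contig=<ID=c1>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n'
  printf 'c1\t123\t.\tG\tT\t.\t.\t.\tGT\t0/1\n'
} > annotation.vcf

if command -v bcftools >/dev/null 2>&1 && [ -f source.vcf.gz ]; then
  # Compress and index the annotation VCF so it can be used with -a.
  bcftools view -O z -o annotation.vcf.gz annotation.vcf
  bcftools index -f -t annotation.vcf.gz
  # Transfer FORMAT/GT from the annotation VCF into the source file.
  bcftools annotate -a annotation.vcf.gz -c FORMAT/GT -O b -o annotated.bcf source.vcf.gz
fi
```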