samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

bcftools annotate - tag ID not defined #2302

Open ValentinaPeona opened 1 month ago

ValentinaPeona commented 1 month ago

Hi! I am having a problem annotating a VCF file of structural variants. I just want to add tags to the INFO field, matching records by ID, but I get this error:

bcftools annotate -a ${ANNOT_FILE}.gz -h ${HDR_FILE} -c CHROM,POS,~ID,REF,ALT,INFO/n_hits,INFO/fragmts,INFO/match_lengths,INFO/repeat_ids,INFO/matching_classes,INFO/RM_hit_strands,INFO/RM_hit_IDs,INFO/total_match_length,INFO/total_match_span ${ANNOT_FILE}.genotypes.sorted.vcf 
The tag "~ID" is not defined in ${ANNOT_FILE}.gz

${ANNOT_FILE}.gz looks like this:

chr1    641270  oenPle.INS.73S0 A       TTTGTGATATAACTAAAGCCAATTCCAATGCCCCATTTTCCTCATAAAAATTAAAAACAAGC  1       20       SINE       +      2     3    20    20
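A side check, separate from the "~ID" error: the -c list names the columns of the annotation file in order, so it is worth confirming that the number of names matches the number of tab-separated fields per line. The example line as rendered here shows 13 values against 14 names in -c; that may only be an artifact of how the line was pasted (for example an empty field), so the sketch below is meant to be run on the real file. The demo file name `demo_annot.tsv` is hypothetical.

```shell
# Column list copied from the -c option of the command above.
COLS='CHROM,POS,~ID,REF,ALT,INFO/n_hits,INFO/fragmts,INFO/match_lengths,INFO/repeat_ids,INFO/matching_classes,INFO/RM_hit_strands,INFO/RM_hit_IDs,INFO/total_match_length,INFO/total_match_span'

# Number of names given to -c.
n_names=$(printf '%s\n' "$COLS" | awk -F',' '{print NF}')

# Hypothetical demo file: the example line from this post, re-typed with tab separators.
DEMO=demo_annot.tsv
printf 'chr1\t641270\toenPle.INS.73S0\tA\tTTTGTGATATAACTAAAGCCAATTCCAATGCCCCATTTTCCTCATAAAAATTAAAAACAAGC\t1\t20\tSINE\t+\t2\t3\t20\t20\n' > "$DEMO"

# Distinct per-line field counts found in the file (ideally a single number).
n_fields=$(awk -F'\t' '{print NF}' "$DEMO" | sort -nu | paste -sd' ' -)

echo "names in -c: $n_names; fields per line: $n_fields"
```

For the compressed file, the same awk command can read from `bgzip -dc ${ANNOT_FILE}.gz` instead of the demo file.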

and it was created as follows together with the header:

bgzip ${ANNOT_FILE}
tabix -s1 -b2 -e2 ${ANNOT_FILE}.gz

HDR_FILE=${ANNOT_FILE}.header

echo -e '##INFO=<ID=n_hits,Number=1,Type=Integer,Description="Number of repeats found in insertion">' >> ${HDR_FILE}
echo -e '##INFO=<ID=match_lengths,Number=.,Type=Integer,Description="Insertion lengths spanned by each repeat">' >> ${HDR_FILE}
echo -e '##INFO=<ID=repeat_ids,Number=.,Type=String,Description="Repeat family IDs">' >> ${HDR_FILE}
echo -e '##INFO=<ID=matching_classes,Number=.,Type=String,Description="Repeat class names">' >> ${HDR_FILE}
echo -e '##INFO=<ID=fragmts,Number=.,Type=Integer,Description="Number of fragments merged into one by one code">' >> ${HDR_FILE}
echo -e '##INFO=<ID=RM_hit_strands,Number=.,Type=String,Description="RepeatMasker hit strands">' >> ${HDR_FILE}
echo -e '##INFO=<ID=RM_hit_IDs,Number=.,Type=String,Description="RepeatMasker hit IDs">' >> ${HDR_FILE}
echo -e '##INFO=<ID=total_match_length,Number=1,Type=Integer,Description="Insertion length spanned by repeats">' >> ${HDR_FILE}
echo -e '##INFO=<ID=total_match_span,Number=1,Type=Float,Description="Insertion span spanned by repeats">' >> ${HDR_FILE}
echo -e '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">' >> ${HDR_FILE}
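One thing to watch with the echo lines above: they append with >>, so running them a second time leaves every header line duplicated in ${HDR_FILE}. A minimal sketch that writes the same lines in one step with a here-document and overwrites the file each time; the fallback name `annotations.tsv` is hypothetical and only used when ANNOT_FILE is not set.

```shell
# ANNOT_FILE comes from the surrounding script; the fallback name is a placeholder.
ANNOT_FILE=${ANNOT_FILE:-annotations.tsv}
HDR_FILE=${ANNOT_FILE}.header

# '>' truncates the file on every run, so re-running does not duplicate lines.
# The quoted 'EOF' keeps the shell from expanding anything inside the block.
cat > "${HDR_FILE}" <<'EOF'
##INFO=<ID=n_hits,Number=1,Type=Integer,Description="Number of repeats found in insertion">
##INFO=<ID=match_lengths,Number=.,Type=Integer,Description="Insertion lengths spanned by each repeat">
##INFO=<ID=repeat_ids,Number=.,Type=String,Description="Repeat family IDs">
##INFO=<ID=matching_classes,Number=.,Type=String,Description="Repeat class names">
##INFO=<ID=fragmts,Number=.,Type=Integer,Description="Number of fragments merged into one by one code">
##INFO=<ID=RM_hit_strands,Number=.,Type=String,Description="RepeatMasker hit strands">
##INFO=<ID=RM_hit_IDs,Number=.,Type=String,Description="RepeatMasker hit IDs">
##INFO=<ID=total_match_length,Number=1,Type=Integer,Description="Insertion length spanned by repeats">
##INFO=<ID=total_match_span,Number=1,Type=Float,Description="Insertion span spanned by repeats">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
EOF
```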

How can I solve the error?

pd3 commented 1 month ago

This looks similar to https://github.com/samtools/bcftools/issues/2297; which version of bcftools are you running?

ValentinaPeona commented 1 month ago

The version I'm using is 1.16

pd3 commented 1 month ago

Please update to the latest version, 1.21; then -c ...,-ID,... should work.
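A small sketch for checking the installed version before re-running the annotate command. The threshold of 1.21 is only the version recommended in this thread, not a confirmed minimum for the fix, and `version_ge` is a helper name made up for this example. It relies on `sort -V` (version sort), which GNU coreutils and recent BSD sort provide.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# sort -V orders 1.9 before 1.16 before 1.21, unlike a plain string sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: the version suggested in this thread.
MIN_VERSION=1.21

if command -v bcftools >/dev/null 2>&1; then
    # The first line of `bcftools --version` has the form "bcftools <version>".
    installed=$(bcftools --version | head -n 1 | awk '{print $2}')
    if version_ge "$installed" "$MIN_VERSION"; then
        echo "bcftools $installed: at or above $MIN_VERSION"
    else
        echo "bcftools $installed is older than $MIN_VERSION; consider upgrading"
    fi
else
    echo "bcftools not found on PATH"
fi
```

With 1.16 installed, as reported above, the check takes the "older than" branch.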