Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

Multiple CSQ fields when re-annotating a VCF with --keep_csq #1780

Closed TimD1 closed 1 day ago

TimD1 commented 4 weeks ago

Describe the issue

When VEP is run with the --keep_csq flag on a VCF that was previously annotated with VEP, a second CSQ definition is added to the header and each variant gets a second INFO/CSQ field. It would be nice if, when running with --keep_csq, the new annotations were appended to the existing CSQ field instead of being added as a second field.

Additional information

This is the same as issue #134 but it occurs when using the --keep_csq flag.

System

Existing Behavior

##fileformat=VCFv4.0
##VEP="v111" time="2018-01-29 09:59:30" ...
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: FIELD1|FIELD2">
##VEP="v111" time="2018-01-29 09:59:42" ...
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: FIELD1|FIELD3">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
chr1 100 . G C 30 . CSQ=DATA1|DATA2;CSQ=DATA1|DATA3

Proposed Behavior

##fileformat=VCFv4.0
##VEP="v111" time="2018-01-29 09:59:30" ...
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: FIELD1|FIELD2|FIELD3">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
chr1 100 . G C 30 . CSQ=DATA1|DATA2|DATA3
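
To make the proposed merge concrete, here is a minimal Python sketch of how the two CSQ entries could be collapsed into one as a post-processing step. It is not part of VEP: the helper names `merge_formats` and `merge_csq_info` are hypothetical. It assumes exactly two CSQ keys per INFO string, and that both carry the same number of comma-separated entries in the same order. That ordering assumption may not hold for real VEP output.

```python
def merge_formats(fmt_a, fmt_b):
    """Return the union of two pipe-delimited CSQ format strings as a list,
    keeping first-seen order (hypothetical helper, not a VEP function)."""
    fields = fmt_a.split("|")
    for name in fmt_b.split("|"):
        if name not in fields:
            fields.append(name)
    return fields


def merge_csq_info(info, fmt_a, fmt_b):
    """Collapse two CSQ keys in an INFO string into a single CSQ key.

    Assumes the first CSQ key is laid out as fmt_a and the second as fmt_b,
    and that their comma-separated entries line up one-to-one.
    """
    names_a = fmt_a.split("|")
    names_b = fmt_b.split("|")
    merged_fields = merge_formats(fmt_a, fmt_b)

    csq_values = []
    other = []
    for item in info.split(";"):
        if item.startswith("CSQ="):
            csq_values.append(item[len("CSQ="):])
        else:
            other.append(item)

    if len(csq_values) != 2:
        return info  # nothing to merge under the stated assumption

    entries_a = csq_values[0].split(",")
    entries_b = csq_values[1].split(",")
    if len(entries_a) != len(entries_b):
        raise ValueError("CSQ entries do not line up one-to-one")

    merged_entries = []
    for entry_a, entry_b in zip(entries_a, entries_b):
        record = dict(zip(names_a, entry_a.split("|")))
        for name, value in zip(names_b, entry_b.split("|")):
            # For shared fields, keep the first value unless it is empty.
            if not record.get(name):
                record[name] = value
        merged_entries.append("|".join(record.get(n, "") for n in merged_fields))

    # Non-CSQ keys are kept and placed before the merged CSQ key.
    return ";".join(other + ["CSQ=" + ",".join(merged_entries)])


if __name__ == "__main__":
    print("|".join(merge_formats("FIELD1|FIELD2", "FIELD1|FIELD3")))
    print(merge_csq_info("CSQ=DATA1|DATA2;CSQ=DATA1|DATA3",
                         "FIELD1|FIELD2", "FIELD1|FIELD3"))
```

Applied to the example above, this yields the format `FIELD1|FIELD2|FIELD3` and the INFO value `CSQ=DATA1|DATA2|DATA3`. The two `##INFO=<ID=CSQ,...>` header lines would also have to be replaced by a single one, which the sketch does not do.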

dglemos commented 4 weeks ago

Hi @TimD1, this is the expected behaviour. The current documentation does not clearly explain how the flag works, but we plan to clarify this in a future release.

Best wishes, Diana

dglemos commented 1 day ago

I'm going to close this issue. If you have more questions, feel free to open a new one.

Best wishes, Diana