Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

Custom bigwig annotation not working for insertion variants #1740

Closed asalimih closed 3 months ago

asalimih commented 3 months ago

Describe the issue

VEP does not output custom bigWig annotation values for insertion variants.

Additional information

I'm trying to annotate variants with conservation scores from a bigWig file using the --custom option, but VEP outputs nothing for insertion variants. I tried the different annotation types (overlap, within, surrounding, exact) with no change.
However, if I manually change an insertion into an SNV at the exact same position, it is annotated successfully. This suggests the problem lies with the variant class rather than the position.
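
To narrow a VCF down to the records affected here, a minimal sketch follows. It assumes an uncompressed, tab-delimited VCF with REF in column 4 and a single ALT allele in column 5, and it treats a record as an insertion when ALT is longer than REF and starts with REF. The function name list_insertions is hypothetical, and multi-allelic ALT fields are not handled.

```shell
# Print CHROM, POS, REF and ALT for simple insertions in an uncompressed VCF.
# Assumption: one ALT allele per record; comma-separated ALTs are not split.
list_insertions() {
    awk -F'\t' '
        /^#/ { next }    # skip meta and header lines
        # insertion: ALT longer than REF and ALT begins with REF
        length($5) > length($4) && index($5, $4) == 1 { print $1, $2, $4, $5 }
    ' "$1"
}
```

For a compressed input such as example.vcf.gz, decompress it first, for example with gzip -dc example.vcf.gz > example.vcf, and then call list_insertions example.vcf.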

System

Full VEP command line

./vep \
    --verbose --cache --offline --merged --species homo_sapiens --assembly GRCh38 \
    --use_given_ref \
    --tab \
    --force_overwrite \
    --dir /opt/vep/.vep \
    --dir_plugins /opt/vep/.vep/Plugins \
    --input_file /opt/vep/files/${inputVcf_file} \
    --output_file /opt/vep/files/${output_file} \
    --pick \
    --fasta /opt/vep/.vep/custom/references/Homo_sapiens_assembly38.fasta \
    --variant_class \
    --allele_number \
    --show_ref_allele \
    --total_length \
    --exclude_predicted \
    --fork ${annotationThreads} \
    --custom file=/opt/vep/.vep/custom/phyloP/hg38.phyloP100way.bw,short_name=phyloP100way,format=bigwig,type=overlap,coords=0

I use docker image ensemblorg/ensembl-vep:release_110.1

dglemos commented 3 months ago

Hi @asalimih, Can you please send an example of the input variants and the custom file?

asalimih commented 3 months ago

Hi @dglemos, sure: example.vcf.gz, hg38.phyloP100way.bw. There are two insertion variants in this VCF that don't get a value.

asalimih commented 3 months ago

@dglemos , could you reproduce the issue?

dglemos commented 3 months ago

Unfortunately I cannot reproduce the issue.

Here is an example of my output using your input file example.vcf:

## VEP command-line: vep --allele_number --assembly GRCh38 --cache_version 112 --custom file=hg38.phyloP100way.bw,short_name=phyloP100way,format=bigwig,type=overlap,coords=0 --database 0 --dir_cache [PATH]/tabixconverted --exclude_predicted --fasta [PATH]/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz --force_overwrite --input_file example.vcf --offline --output_file output.txt --pick --show_ref_allele --tab --total_length --variant_class
#Uploaded_variation     Location        Allele  Gene    Feature Feature_type    Consequence     cDNA_position   CDS_position    Protein_position        Amino_acids     Codons  Existing_variation      ALLELE_NUM      REF_ALLELE      IMPACT  DISTANCE        STRAND  FLAGS   VARIANT_CLASS   SOURCE  phyloP100way
chr6_169663767_T_C      chr6:169663767  C       ENSG00000184465 ENST00000448612 Transcript      intron_variant  -       -       -       -       -       -       1       T       MODIFIER        -       -1      -       SNV     -       -0.84299999475479126
chr5_150203773_T_C      chr5:150203773  C       ENSG00000011083 ENST00000230671 Transcript      synonymous_variant      1460/3625       1194/1911       398/636 D       gaT/gaC -       1       T       LOW     -       1       -       SNV     -       0.707000017166137695
chr1_187744_A_G chr1:187744     G       ENSG00000279457 ENST00000623083 Transcript      intron_variant,non_coding_transcript_variant    -       -       -       -       -       -       1       A       MODIFIER        -       -1      -       SNV     -       -1.55999994277954102
chr1_1757145_T_TGGGGGGGGGG      chr1:1757145-1757146    GGGGGGGGGG      ENSG00000008130 ENST00000341426 Transcript      intron_variant  -       -       -       -       -       -       1       -       MODIFIER        -       -1      -       insertion       -       0.287000000476837158
chr1_1757145_T_G        chr1:1757145    G       ENSG00000008130 ENST00000341426 Transcript      intron_variant  -       -       -       -       -       -       1       T       MODIFIER        -       -1      -       SNV     -       -0.800000011920928955
chr2_219601622_G_GT     chr2:219601622-219601623        T       ENSG00000144589 ENST00000456909 Transcript      intron_variant  -       -       -       -       -       -       1       -       MODIFIER        -       1       -       insertion       -       0.97299998998641967
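
To check an output file for the same symptom without reading it by eye, a minimal sketch is below. It assumes tab-delimited output produced with --tab, a header line that contains VARIANT_CLASS and the custom short name (phyloP100way here), and "-" for empty fields, as in the output above. The function name missing_custom_for_insertions is hypothetical.

```shell
# List the Uploaded_variation IDs of insertion rows whose custom column is "-".
# Usage: missing_custom_for_insertions output.txt phyloP100way
missing_custom_for_insertions() {
    awk -F'\t' -v col="$2" '
        /^##/ { next }                                   # skip meta lines
        /^#/  { for (i = 1; i <= NF; i++) idx[$i] = i    # header: name -> column index
                next }
        (col in idx) && ("VARIANT_CLASS" in idx) &&
        $(idx["VARIANT_CLASS"]) == "insertion" && $(idx[col]) == "-" { print $1 }
    ' "$1"
}
```

An empty result means every insertion row carries a value in that column; any IDs printed are insertions that were left unannotated.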

Can you please run VEP with the following options using the input example.vcf:

./vep \
    --verbose --cache --offline --merged --species homo_sapiens --assembly GRCh38 \
    --use_given_ref \
    --tab \
    --force_overwrite \
    --dir /opt/vep/.vep \
    --dir_plugins /opt/vep/.vep/Plugins \
    --input_file example.vcf \
    --output_file output.txt \
    --fasta /opt/vep/.vep/custom/references/Homo_sapiens_assembly38.fasta \
    --custom file=/opt/vep/.vep/custom/phyloP/hg38.phyloP100way.bw,short_name=phyloP100way,format=bigwig,type=overlap,coords=0

asalimih commented 3 months ago

I tried it, but I'm still not getting values for insertion variants. For more information: I'm using the docker image ensemblorg/ensembl-vep:release_110.1, and when I run the command I get the following messages:

Smartmatch is experimental at /opt/vep/src/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
2024-08-19 10:04:50 - Ignored unsupported option 'pluginsdir=/plugins' from environment variable VEP_PLUGINSDIR
2024-08-19 10:04:50 - Ignored unsupported option 'no_htslib=1' from environment variable VEP_NO_HTSLIB
2024-08-19 10:04:50 - Ignored unsupported option 'no_plugins=1' from environment variable VEP_NO_PLUGINS
2024-08-19 10:04:50 - Set 'dir_plugins=/plugins' from environment variable VEP_DIR_PLUGINS
2024-08-19 10:04:50 - Ignored unsupported option 'no_update=1' from environment variable VEP_NO_UPDATE
2024-08-19 10:04:50 - Read configuration from environment variables
2024-08-19 10:04:50 - No input file format specified - detected vcf format

dglemos commented 3 months ago

2024-08-19 10:04:50 - Read configuration from environment variables

Can you share the environment variables?
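
Judging by the log lines above, the variables VEP picks up appear to be the ones prefixed with VEP_, so the full environment is not strictly needed. A minimal sketch for listing only those, with the hypothetical function name list_vep_env:

```shell
# Print only the VEP_-prefixed environment variables, sorted by name.
# Assumption: the VEP_ prefix is what matters, as the log messages suggest.
list_vep_env() {
    printenv | grep '^VEP_' | sort
}
```

Running list_vep_env inside the container should show entries such as VEP_DIR_PLUGINS, which can then be compared against the "Set" and "Ignored unsupported option" messages in the log.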

asalimih commented 3 months ago

Sure, here is the output of printenv inside the docker container:

LC_ALL=en_US.UTF-8
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
OPT=/opt/vep
LANG=en_US.UTF-8
KENT_SRC=/opt/vep/src/kent-335_base/src
HOSTNAME=3569d059b675
OPT_SRC=/opt/vep/src
HTSLIB_DIR=/opt/vep/src/htslib
PWD=/opt/vep/.vep
HOME=/opt/vep
PLUGIN_DEPS=https://raw.githubusercontent.com/Ensembl/VEP_plugins/release/110/config
VEP_PLUGINSDIR=/plugins
VEP_NO_UPDATE=1
VEP_NO_HTSLIB=1
TERM=xterm
PERL5LIB=:/opt/vep/src/ensembl-vep:/opt/vep/src/ensembl-vep/modules
SHLVL=1
VEP_NO_PLUGINS=1
DEPS=/opt/vep/src
LANG_VAR=en_US.UTF-8
PATH=/opt/vep/src/ensembl-vep:/opt/vep/src/var_c_code:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
VEP_DIR_PLUGINS=/plugins
PERL5LIB_TMP=:/opt/vep/src/ensembl-vep:/opt/vep/src/ensembl-vep/modules
_=/usr/bin/printenv

dglemos commented 3 months ago

I'm sorry, I didn't notice you were using version 110. With that version I can reproduce the issue. The result I sent you previously was produced with the latest version, 112. Could you please update your vep to use the latest version and test the command again?

asalimih commented 3 months ago

Updating to version 112 solved the issue. Thanks.