Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

Problem with VEP query stuck without results (without offline mode) #1806

Open Taalouane opened 4 days ago

Taalouane commented 4 days ago

Hello,

I am contacting you regarding an issue encountered while using VEP to query the Ensembl database.

The query remains stuck, with no results returned and no error message. The process seems to run indefinitely without reaching a final result.

The same query previously worked for me several times. The MySQL connection to the server is also functioning normally, as shown by the following command:

    mysql --host=ensembldb.ensembl.org --user=anonymous

The connection establishes without any issue and gives me access to the database, confirming that the connection to the server is working. However, the query via VEP remains stuck.

I would appreciate your expertise in identifying whether this problem could be related to a recent configuration change in VEP, or if other factors could explain this behavior after the system restart.

Thank you in advance for your help.

                   singularity exec \
                    $path_SIF/VEPv109.3.sif /opt/vep/src/ensembl-vep/vep \
                 --input_file  ${sample_name}.txt \
                 --output_file  ${sample_name}.vcf \
                 --format hgvs \
                 --cache \
                 --dir_cache $path_cache_dir \
                 --vcf --fork 8 \
                 --buffer_size 1000 \
                 --assembly GRCh38 \
                 --total_length \
                 --no_stats \
                 --sift b \
                 --polyphen b \
                 --hgvs \
                 --hgvsg \
                 --symbol \
                 --no_escape \
                 --numbers \
                 --domains \
                 --regulatory \
                 --protein \
                 --biotype \
                 --variant_class \
                 --check_existing \
                 --pubmed \
                 --canonical \
                 --af \
                 --af_1kg \
                 --dir_plugins $path_plugins_dir \
                 --pick \
                 --quiet  \
                 --force_overwrite \
                 --plugin dbNSFP,$path_resources/dbNSFP_V2a/dbNSFP4.2a.txt.gz,GTEx_V8_tissue,GTEx_V8_gene,MetaLR_score,MetaLR_pred,MetaRNN_score,MetaRNN_pred,MutationTaster_score,MutationTaster_pred,FATHMM_score,FATHMM_pred,PROVEAN_score,PROVEAN_pred,MetaSVM_score,MetaSVM_pred,PrimateAI_score,PrimateAI_pred,ClinPred_score,ClinPred_pred \
                 --plugin ExACpLI \
                 --plugin REVEL,$path_resources/REVEL_v1.3/new_tabbed_revel_grch38.tsv.gz \
                 --plugin ExAC,$path_resources/ExAC_V0.3/ExAC.0.3.GRCh38.vcf.gz \
                 --plugin CADD,$path_resources/CADD_v1.6/whole_genome_SNVs.tsv.gz,${path_resources}/CADD_v1.6/gnomad.genomes.r3.0.indel.tsv.gz \
                 --plugin SpliceAI,snv=$path_resources/SpliceAI_scores_v1.3/spliceai_scores.raw.snv.hg38.vcf.gz,indel=$path_resources/SpliceAI_scores_v1.3/spliceai_scores.raw.indel.hg38.vcf.gz \
                 --custom $path_resources/gnomADg_V3.1.1/gnomad.genomes.v3.1.2.sites.vcf.gz,gnomAD_genomes,vcf,exact,0,AF,popmax,AF_popmax,AF_XX,AF_XY,AF_oth,AF_ami,AF_sas,AF_fin,AF_eas,AF_afr,AF_asj,AF_amr,AF_mid,AF_nfe,nhomalt \
                 --custom $path_resources/gnomADe_V2.1.1/gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz,gnomAD_exomes,vcf,exact,0,AF,popmax,AF_popmax,AF_female,AF_male,AF_afr,AF_amr,AF_asj,AF_eas,AF_fin,AF_nfe,AF_oth,AF_sas,nhomalt \
                 --custom $path_resources/clinvar_v20230702/clinvar_20230702.vcf.gz,CLINVAR,vcf,exact,0,ALLELEID,CLNSIG,CLNDN,CLNREVSTAT,CLNDISDB \
                 --custom $path_resources/PhyloP/hg38.phyloP100way.bw,phylop100verts,bigwig,exact,0 \
                 --custom $path_resources/PhastCons/hg38.phastCons100way.bw,phastcons100verts,bigwig,exact,0 \
                 --custom $path_resources/PhyloP/hg38.phyloP30way.bw,phylop30mams,bigwig,exact,0 \
                 --custom $path_resources/PhastCons/hg38.phastCons30way.bw,phastcons30mams,bigwig,exact,0 \
                 --custom $path_resources/PhyloP/hg38.phyloP17way.bw,phyloP17primates,bigwig,exact,0 \
                 --custom $path_resources/PhastCons/hg38.phastCons17way.bw,phastcons17primates,bigwig,exact,0 \
                 2> VEP.log

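
Because the run above hangs without any error, one option is to put a time limit on it so that a stuck job ends with a recognisable exit status instead of running indefinitely. The following is a minimal sketch, not part of VEP: `run_with_limit` is a hypothetical helper name, and it assumes GNU coreutils `timeout` is installed, which exits with status 124 when the limit is reached.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of VEP): run a command under a time limit.
# Assumes GNU coreutils `timeout`, which exits with 124 when the limit is hit.
run_with_limit() {
  local limit="$1"   # duration understood by timeout, e.g. 30s, 10m, 2h
  shift
  timeout "$limit" "$@"
  local status=$?
  if [ "$status" -eq 124 ]; then
    # Report the timeout on stderr so it ends up in the same log as other errors.
    echo "command stopped after exceeding limit of $limit" >&2
  fi
  return "$status"
}
```

The VEP call could then be started as `run_with_limit 2h singularity exec ...` with the same arguments as above; the `2h` limit is only an illustrative value.
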
dglemos commented 4 days ago

Hi @Taalouane, the --cache option still connects to the database, which slows VEP down. Can you please use --offline instead?

Also, to run in offline mode with the HGVS options, you should provide a FASTA file with --fasta. We also recommend that you download the indexed cache files. You can find them here for the current version, 113: https://ftp.ensembl.org/pub/current_variation/indexed_vep_cache/

dglemos commented 4 days ago

Another possible reason the query is slower than usual is the ongoing issues with the MySQL connection, which could affect job performance. If your input file is supported offline, I would recommend that you always use --offline instead of --cache.

Taalouane commented 4 days ago

It is not possible to use offline mode for input with HGVS format, it's not a VCF, it's a list of HGVS variants.

    MSG: ERROR: Cannot use HGVS format in offline mode

I added the FASTA file, but the problem persists.

The cache file is indexed, and I have it for the same version of VEP: v109. Do you think I should switch to the new version of VEP and indexed cache?

Taalouane commented 4 days ago

Additional information: My command was working without any issues before. Is there any problem or maintenance on your server side?

dglemos commented 4 days ago

> It is not possible to use offline mode for input with HGVS format, it's not a VCF, it's a list of HGVS variants.

For HGVS input you have to keep running with --cache, which establishes a connection to the database.
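
The rule discussed in this thread (use --offline whenever the input format allows it, and fall back to --cache for HGVS input) can be sketched as a small shell helper. This is an illustration only: `vep_mode_flag` is a hypothetical name, and treating `id` input the same way as `hgvs` is an assumption that is not confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of VEP): pick the mode flag from the value
# passed to --format.
vep_mode_flag() {
  local format="$1"
  case "$format" in
    # HGVS input is rejected in offline mode ("Cannot use HGVS format in
    # offline mode"). Handling "id" the same way is an assumption.
    hgvs|id) echo "--cache" ;;
    # Everything else is assumed to be usable without a database connection.
    *)       echo "--offline" ;;
  esac
}

# Example: choose the flag for a VCF input file.
echo "mode flag for vcf input: $(vep_mode_flag vcf)"
```

The returned flag would replace the hard-coded `--cache` in the command line, for example `mode="$(vep_mode_flag hgvs)"` followed by passing `$mode` to vep.
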

> Do you think I should switch to the new version of VEP and indexed cache?

We recommend that everyone use the latest VEP code and indexed cache.

> Is there any problem or maintenance on your server side?

There are a few ongoing issues with the MySQL connection, which could affect job performance. Please let us know if they persist over the next few days.

Taalouane commented 4 days ago

Thank you @dglemos for your response. I will get back to you.

Could you please forward the information regarding the slow connection to your team responsible for the MySQL server?

Thank you again.