Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

issue specifying cache dir (-d) and downloading files #1719

Closed by errcricket 4 months ago

errcricket commented 4 months ago

Greetings and thank you for developing this tool.

I am trying to do a fresh install of VEP (version 112, perl 5.26.3) following these instructions. Specifically, it says:

By default VEP installs cache files in a folder in your home area ($HOME/.vep); you can easily change this using the -d flag when running the installer.

I am running VEP on a Linux server (CentOS) at work, where we have very little space in our home directories and are instead required to use mounted drives for storage.

I ran the following command:

perl INSTALL.pl -d /x/xx/xxx/xxxx/installed_packages/vep_cache/

The first time I tried installing VEP, all of the items going into the Bio directory were downloaded under ensembl-vep, but this time Bio was placed under ....../installed_packages/vep_cache/Bio. When it came to storing cache files, the message listed my home directory as the installation location instead of the -d location, and it timed out when trying to download files.

All tests successful.
Files=49, Tests=1966, 98 wallclock secs ( 0.48 usr  0.16 sys + 86.19 cusr  6.19 csys = 93.02 CPU) 
Result: PASS
 - OK!

The VEP can either connect to remote or local databases, or use local cache files. 
Using local cache files is the fastest and most efficient way to run the VEP 
Cache files will be stored in /home/my_username/.vep             
Do you want to install any cache files (y/n)? y  
getting list of available cache files
ERROR: Could not connect to FTP host ftp.ensembl.org
Connection timed out                                       

Can you provide some guidance with an example of how I can ensure the cache location is the one listed with -d and what to do with the timeout issue please?

Many many (many) thanks in advance.

p.s. tabix is installed.

dglemos commented 4 months ago

Hi @errcricket,

The option -d (--DESTDIR) sets the directory where the API modules are installed; it does not affect the cache location. To specify a different cache directory, use the option -c (--CACHEDIR). The full list of installation options is here: https://www.ensembl.org/info/docs/tools/vep/script/vep_download.html#installer
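For the command in the original post, that would mean passing both options. A minimal sketch, assuming placeholder paths on the mounted drive (the `vep_api` and `vep_cache` directory names are hypothetical, not from this thread); it only prints the command unless `RUN_INSTALL=1` is set:

```shell
# Hypothetical locations on a mounted drive; substitute your own paths.
API_DIR="/path/to/installed_packages/vep_api"      # assumption: target for the API modules (Bio/ etc.)
CACHE_DIR="/path/to/installed_packages/vep_cache"  # assumption: target for the cache files

# --DESTDIR (-d) sets the API module directory; --CACHEDIR (-c) sets the cache directory.
INSTALL_CMD="perl INSTALL.pl --DESTDIR $API_DIR --CACHEDIR $CACHE_DIR"
echo "$INSTALL_CMD"

# Run from the ensembl-vep checkout, and only when explicitly requested.
# Note: the unquoted expansion below assumes the paths contain no spaces.
if [ "${RUN_INSTALL:-0}" = "1" ]; then
    mkdir -p "$API_DIR" "$CACHE_DIR"
    $INSTALL_CMD
fi
```

With -c set, the installer's "Cache files will be stored in ..." message should report the chosen directory rather than $HOME/.vep.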

As an alternative, you can manually download cache files to your directory. See here for more details: https://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#cache
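A manual download might look roughly like the sketch below. The release number (112) comes from this thread; the species, assembly, and cache path are assumptions chosen for illustration, so adjust them to your data. Fetching over HTTPS may also sidestep the FTP timeout if only FTP is blocked on the server, though that depends on the local network setup. The download runs only when `RUN_DOWNLOAD=1` is set:

```shell
# Assumptions: human GRCh38 cache and a placeholder cache path; release 112 is from the thread.
RELEASE=112
SPECIES="homo_sapiens"
ASSEMBLY="GRCh38"
CACHE_DIR="/path/to/installed_packages/vep_cache"   # hypothetical path

TARBALL="${SPECIES}_vep_${RELEASE}_${ASSEMBLY}.tar.gz"
CACHE_URL="https://ftp.ensembl.org/pub/release-${RELEASE}/variation/indexed_vep_cache/${TARBALL}"
echo "$CACHE_URL"

# Download and unpack into the cache directory only when explicitly requested.
if [ "${RUN_DOWNLOAD:-0}" = "1" ]; then
    mkdir -p "$CACHE_DIR"
    cd "$CACHE_DIR" || exit 1
    curl -O "$CACHE_URL"
    tar xzf "$TARBALL"
fi
```

When running VEP afterwards, the cache location can be passed with `--cache --dir_cache` followed by the same directory, since VEP otherwise looks in $HOME/.vep.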

Let me know if you have more questions.

Best wishes, Diana

errcricket commented 4 months ago

Thank you @dglemos, that resolved all of my issues. Thank you!