PharmGKB / pgkb-ngs-pipeline

PharmGKB NGS Pipeline
Mozilla Public License 2.0

curl of hg38bundle.tar.gz doesn't work #2

Open rgiannico opened 5 years ago

rgiannico commented 5 years ago

Hi Mark, I'm so sorry to bother you again, but I just found that this command doesn't work:

$ curl -o external_data/grc38.tar.gz ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/hg38/hg38bundle.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
curl: (78) RETR response: 550

This is because the file is no longer available, as you can see on the FTP site: ftp://ftp.broadinstitute.org/bundle/hg38/

This of course affects both the Dockerfile and the standard installation instructions. Do you have any idea how to solve this?

ProzacR commented 5 years ago

I tried to install it and faced exactly the same issue.

whaleyr commented 5 years ago

The Broad Institute has reorganized their FTP server and we're having a hard time finding the equivalent file to the one used in the script.

If you find the equivalent file, please post the URL here and we'll update the script. Otherwise, we'll post a fix if/when we can find it ourselves.

mwaldron104 commented 4 years ago

I have a researcher that wants to use this tool, but there's still this issue of the hg38bundle.tar.gz file not being available. Has no one been able to locate it in a year? I sent a message to the Broad Institute, but received no reply.

wbgalvao commented 4 years ago

For anyone running into this problem, I was able to download the reference files from Broad's hg38 bundle by connecting to their FTP server (ftp.broadinstitute.org). You can use gsapubftp-anonymous as the username and your email address as the password to log in "anonymously".

There you can find the /bundle/hg38/ directory with all the reference files needed for the pipeline. Note that Homo_sapiens_assembly38.known_indels.vcf.gz is under the beta directory.

If you want to do this from the command line you can run the following command:

wget --ftp-user=gsapubftp-anonymous --ftp-password='your@email.address' 'ftp://ftp.broadinstitute.org/bundle/hg38/<target-file>'
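
If several files are needed, the manual download could be wrapped in a small shell helper. The following is only a sketch: the host, username, and bundle directory are taken from this thread, while the function names, the BUNDLE_EMAIL variable, and the default external_data output directory are assumptions made for illustration. It passes the email address through wget's --ftp-password option so the "@" never has to appear inside the URL, and it has not been checked against the current state of the Broad server.

```shell
#!/usr/bin/env bash
# Sketch: fetch files from the Broad hg38 bundle over anonymous FTP.
# Host, user, and directory come from this thread; the function names,
# BUNDLE_EMAIL, and the default output directory are hypothetical.

BUNDLE_HOST="ftp.broadinstitute.org"
BUNDLE_DIR="bundle/hg38"
BUNDLE_USER="gsapubftp-anonymous"

# Build the FTP URL for a path relative to the bundle directory.
bundle_url() {
  printf 'ftp://%s/%s/%s\n' "$BUNDLE_HOST" "$BUNDLE_DIR" "$1"
}

# Download one file into a target directory (default: external_data).
# Refuses to run unless BUNDLE_EMAIL is set, since the email address
# serves as the anonymous password.
fetch_bundle_file() {
  local rel_path="$1"
  local out_dir="${2:-external_data}"
  if [ -z "${BUNDLE_EMAIL:-}" ]; then
    echo "set BUNDLE_EMAIL to your email address first" >&2
    return 1
  fi
  mkdir -p "$out_dir" || return 1
  wget --continue \
       --ftp-user="$BUNDLE_USER" \
       --ftp-password="$BUNDLE_EMAIL" \
       --directory-prefix="$out_dir" \
       "$(bundle_url "$rel_path")"
}
```

After sourcing the script and exporting BUNDLE_EMAIL, a call such as `fetch_bundle_file beta/Homo_sapiens_assembly38.known_indels.vcf.gz` would attempt to fetch the known-indels file mentioned above; the other file names have to be read off the directory listing.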