bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Error installing hg38 dbsnp data: ERROR 404: Not Found #3707

Open amizeranschi opened 1 year ago

amizeranschi commented 1 year ago

On bcbio-nextgen v1.2.9, the command

bcbio_nextgen.py upgrade -u skip --genomes hg38 --aligners bwa

results in the following error:

Running GGD recipe: hg38 dbsnp 156-20230320
--2023-05-10 01:03:20--  http://ftp.ncbi.nih.gov/snp/archive/b156/VCF/GCF_000001405.38.gz
Resolving ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)... 130.14.250.12, 130.14.250.10, 2607:f220:41f:250::228, ...
Connecting to ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)|130.14.250.12|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://ftp.ncbi.nih.gov/snp/archive/b156/VCF/GCF_000001405.38.gz [following]
--2023-05-10 01:03:21--  https://ftp.ncbi.nih.gov/snp/archive/b156/VCF/GCF_000001405.38.gz
Connecting to ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)|130.14.250.12|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2023-05-10 01:03:21 ERROR 404: Not Found.

Traceback (most recent call last):
  File "/data/share/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 228, in <module>
    install.upgrade_bcbio(kwargs["args"])
  File "/data/share/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/install.py", line 109, in upgrade_bcbio
    upgrade_bcbio_data(args, REMOTES)
  File "/data/share/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/install.py", line 361, in upgrade_bcbio_data
    args.cores, ["ggd", "s3", "raw"])
  File "/home/ubuntu/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 354, in install_data_local
    _prep_genomes(env, genomes, genome_indexes, ready_approaches, data_filedir)
  File "/home/ubuntu/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 480, in _prep_genomes
    retrieve_fn(env, manager, gid, idx)
  File "/home/ubuntu/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 875, in _install_with_ggd
    ggd.install_recipe(os.getcwd(), env.system_install, recipe_file, gid)
  File "/home/ubuntu/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 30, in install_recipe
    recipe["recipe"]["full"]["recipe_type"], system_install)
  File "/home/ubuntu/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 62, in _run_recipe
    subprocess.check_output(["bash", run_file])
  File "/data/share/bcbio-nextgen/anaconda/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/data/share/bcbio-nextgen/anaconda/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['bash', '/data/share/bcbio-nextgen/genomes/Hsapiens/hg38/txtmp/ggd-run.sh']' returned non-zero exit status 8.

prsliwa commented 1 year ago

Not sure if this is right, but I see that GCF_000001405.38 is gone and GCF_000001405.40 is now available: https://ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/
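
A quick way to confirm which file the server still offers is to send a HEAD request for each candidate before rerunning the upgrade. The sketch below is only an illustration: `dbsnp_vcf_url`, `url_exists`, and `report` are hypothetical helper names, not part of bcbio or cloudbiolinux, and it assumes the `.40` file lives under the same `archive/b156/VCF/` path that the failing recipe used for `.38`.

```python
import urllib.error
import urllib.request

# Base path taken from the wget log above (after the redirect to HTTPS).
DBSNP_ARCHIVE_BASE = "https://ftp.ncbi.nih.gov/snp/archive"


def dbsnp_vcf_url(build, accession):
    """Build a dbSNP VCF URL, e.g. build 'b156', accession 'GCF_000001405.40'."""
    return f"{DBSNP_ARCHIVE_BASE}/{build}/VCF/{accession}.gz"


def url_exists(url, timeout=30):
    """Return True if a HEAD request succeeds, False on an HTTP error such as 404.

    Network-level failures (DNS, timeout) raise urllib.error.URLError and are
    deliberately not swallowed, so they are not mistaken for a missing file.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError:
        return False


def report(build="b156", accessions=("GCF_000001405.38", "GCF_000001405.40")):
    """Print whether each candidate VCF is still present on the server."""
    for accession in accessions:
        url = dbsnp_vcf_url(build, accession)
        print(url, "->", "found" if url_exists(url) else "missing")
```

Calling `report()` from a Python prompt prints one line per candidate. If only the `.40` URL is found, that would support the idea that the recipe is pointing at a file name that no longer exists.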

WuRAFY commented 1 year ago

I ran into this problem too. Is there a fix yet?

rchekaluk commented 1 year ago

Presumably this has been fixed by https://github.com/chapmanb/cloudbiolinux/pull/412