pcingola / SnpEff

Databases are returning "The specified blob does not exist" #398

Closed MikeWLloyd closed 2 years ago

MikeWLloyd commented 2 years ago

The majority of Azure blobs are again not accessible. I am starting a new issue for this as #374 spun out into other topics.

Example:

https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip

This XML file does not appear to have any style information associated with it. The document tree is shown below.
<Error>
<Code>BlobNotFound</Code>
<Message>
The specified blob does not exist. RequestId:a24c8a25-a01e-001e-1896-59fe97000000 Time:2022-04-26T17:53:08.7641082Z
</Message>
</Error>

pcingola commented 2 years ago

Works OK for me, please check that this is not related to your local environment / temporary failure:

$ snpeff download -v GRCm38.99
00:00:00 SnpEff version SnpEff 5.1d (build 2022-04-19 15:49), by Pablo Cingolani
00:00:00 Command: 'download'
00:00:00 Reading configuration file 'snpEff.config'. Genome: 'GRCm38.99'
00:00:00 Reading config file: /Users/pcingola/snpEff/snpEff.config
00:00:00 done
00:00:00 Downloading database for 'GRCm38.99'
00:00:00 Downloading from 'https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip' to local file '/var/folders/s9/y0bgs3l55rj_jkkkxr2drz4157r1dz/T//snpEff_v5_1_GRCm38.99.zip'
00:00:00 Connecting to https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip
00:00:01 Connecting to https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip, using proxy: false
00:00:01 ERROR while connecting to https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip
00:00:01 Downloading from 'https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_GRCm38.99.zip' to local file '/var/folders/s9/y0bgs3l55rj_jkkkxr2drz4157r1dz/T//snpEff_v5_0_GRCm38.99.zip'
00:00:01 Connecting to https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_GRCm38.99.zip
00:00:01 Connecting to https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_GRCm38.99.zip, using proxy: false
00:00:02 Local file name: '/var/folders/s9/y0bgs3l55rj_jkkkxr2drz4157r1dz/T//snpEff_v5_0_GRCm38.99.zip'
......

RamRS commented 2 years ago

I can confirm that I ran into the same error (blob not found) while trying to download GRCh38.99. I'll try again in a bit and let you know if it's resolved now.

RamRS commented 2 years ago

The error persists:

[user@compute-node snpEff]$ java -jar snpEff.jar download GRCh38.99
00:00:00 ERROR while connecting to https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCh38.99.zip

Visiting the link manually, I see:

<Error>
<Code>BlobNotFound</Code>
<Message>
The specified blob does not exist. RequestId:542b95a6-b01e-0002-30aa-59acf7000000 Time:2022-04-26T20:19:14.1005269Z
</Message>
</Error>

MikeWLloyd commented 2 years ago

Could this be a firewall issue, a local vs. external settings issue, or a permissions issue? I still cannot get many of the links to work. I've tried several machines and browsers.

I will say that some of the links do work, but not the majority.

pcingola commented 2 years ago

I think I now understand what the problem is. Let me explain:

Solution: Upgrading to the latest release should solve the issue. Are you using the latest version, 5.1d? If not, please upgrade.

Description of the issue: As of the latest 5.1 versions, SnpEff has a "fallback" mechanism for "compatible database formats". If database format version 5.1 is backward compatible with 5.0, then SnpEff version 5.1 can use the "old" (5.0) databases.

For a concrete example: Let's say GRCh38.99 was built for SnpEff version 5.0. When I released version 5.1, I didn't need to rebuild GRCh38.99 (and all the other 40,000 databases) for 5.1; we can just use the "old" 5.0 versions.

How does it work? SnpEff first looks for the database in the "5.1" directory; if it is not there, it falls back to the "5.0" directory. If you take a look at the output of the snpeff download -v command, you'll see that it tries twice:

$ snpeff download -v GRCm38.99
00:00:00 SnpEff version SnpEff 5.1d (build 2022-04-19 15:49), by Pablo Cingolani
00:00:00 Command: 'download'
00:00:00 Reading configuration file 'snpEff.config'. Genome: 'GRCm38.99'
00:00:00 Reading config file: /Users/pcingola/snpEff/snpEff.config
00:00:00 done
00:00:00 Downloading database for 'GRCm38.99'
00:00:00 Downloading from 'https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip' to local file '/var/folders/s9/y0bgs3l55rj_jkkkxr2drz4157r1dz/T//snpEff_v5_1_GRCm38.99.zip'
00:00:00 Connecting to https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip
00:00:01 Connecting to https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip, using proxy: false
00:00:01 ERROR while connecting to https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip
00:00:01 Downloading from 'https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_GRCm38.99.zip' to local file '/var/folders/s9/y0bgs3l55rj_jkkkxr2drz4157r1dz/T//snpEff_v5_0_GRCm38.99.zip'
00:00:01 Connecting to https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_GRCm38.99.zip
00:00:01 Connecting to https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_GRCm38.99.zip, using proxy: false
00:00:02 Local file name: '/var/folders/s9/y0bgs3l55rj_jkkkxr2drz4157r1dz/T//snpEff_v5_0_GRCm38.99.zip'
......

So it starts with https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_GRCm38.99.zip and doesn't find the blob there. It then proceeds to the 5.0 directory, https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_GRCm38.99.zip, where the blob is found, and downloads and installs it from there.
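
The fallback order described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SnpEff's actual (Java) implementation: the function names `database_url` and `resolve_database` are hypothetical, and the `exists` callback stands in for whatever HTTP check the real downloader performs. The URL layout is taken from the log output in this thread.

```python
from typing import Callable, Iterable, Optional

# Base URL as it appears in the download log above.
BASE_URL = "https://snpeff.blob.core.windows.net/databases"


def database_url(genome: str, version: str) -> str:
    """Build the blob URL for a genome and a database format version such as '5.1'."""
    tag = "v" + version.replace(".", "_")  # '5.1' -> 'v5_1'
    return f"{BASE_URL}/{tag}/snpEff_{tag}_{genome}.zip"


def resolve_database(
    genome: str,
    versions: Iterable[str],
    exists: Callable[[str], bool],
) -> Optional[str]:
    """Return the first URL that exists, trying versions in the given order.

    `exists` is a hypothetical callback (for example, an HTTP request that
    treats a BlobNotFound response as False); it is injected so the lookup
    order can be exercised without network access.
    """
    for version in versions:
        url = database_url(genome, version)
        if exists(url):
            return url
    return None


# Simulate the situation in this thread: only the 5.0 blob is present.
available = {database_url("GRCm38.99", "5.0")}
chosen = resolve_database("GRCm38.99", ["5.1", "5.0"], lambda u: u in available)
print(chosen)
```

With only the 5.0 blob marked as available, the lookup skips the missing v5_1 URL and returns the v5_0 one, which mirrors the two attempts visible in the log.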

I hope this clarifies the issue and how to solve it.

MikeWLloyd commented 2 years ago

@pcingola Thank you for clarifying! That was hugely helpful. It turned out I was missing the simple inclusion of the -v flag. It wasn't present in the online documentation, which tripped me up.

I was able to pull with the -v option. I am seeing separate Java issues now, but this is likely down to my local install.

Thanks again for the detailed explanation, really helped me to finally understand what is going on.

pcingola commented 2 years ago

Great to hear it works now. Sorry the -v flag is missing from the docs, but it works across all sub-commands in both SnpEff and SnpSift.