broadinstitute / gnomad-browser

Explore gnomAD datasets on the web
https://gnomad.broadinstitute.org
MIT License

404 for Variant co-occurrence / Microsoft #1353

Closed lindenb closed 7 months ago

lindenb commented 11 months ago

Hi gnomAD Team,

On https://gnomad.broadinstitute.org/downloads#v2-variant-cooccurrence

the Microsoft link https://datasetgnomad.blob.core.windows.net/dataset/release/2.1.1/ht/exomes_phased_counts_0.05_3_prime_UTR_variant_vp.ht

returns a 404.

What you did:

$ wget "https://datasetgnomad.blob.core.windows.net/dataset/release/2.1.1/ht/exomes_phased_counts_0.05_3_prime_UTR_variant_vp.ht"
--2023-12-08 14:19:09--  https://datasetgnomad.blob.core.windows.net/dataset/release/2.1.1/ht/exomes_phased_counts_0.05_3_prime_UTR_variant_vp.ht
(...)
Proxy request sent, awaiting response... 404 The specified blob does not exist.
2023-12-08 14:19:10 ERROR 404: The specified blob does not exist..
rileyhgrant commented 8 months ago

Hiya, sorry for the incredible delay on this.

I hope you've managed to get the data from one of the other providers in the meantime. It looks like we'll have to re-sync this data, and then this link should work; the data appears to be there in AWS and GCS.

As I understand it, when the methods team copies the data to AWS, the sync to Azure is typically automatic. It appears something went wrong for this one.

sjahl commented 7 months ago

Hi @lindenb, @rileyhgrant, I don't think the table is actually missing here. Hail tables are technically just folders containing many other files, so wget-ing that top-level folder isn't a meaningful request. There's a lot of detail in how object stores work, and in why this particular problem presents this way, which I won't go into. Long story short, you'll need to issue a request that indicates you want to download all the objects whose names start with this exomes_phased_counts_0.05_3_prime_UTR_variant_vp.ht filepath.

If you need to use wget, you'll have to dig through its (and Azure's) documentation to figure out the right incantation. azcopy is the tool that we recommend using on the Downloads page, and you can pull that Hail table down into your current directory via:

azcopy copy --recursive "https://datasetgnomad.blob.core.windows.net/dataset/release/2.1.1/ht/exomes_phased_counts_0.05_3_prime_UTR_variant_vp.ht" .
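
For anyone who cannot install azcopy, the "download every object that starts with this path" idea can be sketched in Python against Azure's List Blobs REST operation. This is only a rough illustration, not something the gnomAD team ships: it assumes the container permits anonymous listing (which has not been verified here), and the helper names list_url, parse_listing, local_path, iter_blob_names and download_table are made up for this example.

```python
import os
import posixpath
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Values taken from the thread; the trailing slash limits matches to the table folder.
CONTAINER_URL = "https://datasetgnomad.blob.core.windows.net/dataset"
PREFIX = "release/2.1.1/ht/exomes_phased_counts_0.05_3_prime_UTR_variant_vp.ht/"


def list_url(container_url, prefix, marker=None):
    """Build a List Blobs URL for all objects whose names start with prefix."""
    params = {"restype": "container", "comp": "list", "prefix": prefix}
    if marker:
        params["marker"] = marker  # continuation token from the previous page
    return container_url + "?" + urllib.parse.urlencode(params)


def parse_listing(xml_data):
    """Return (blob names, next marker or None) from a List Blobs XML response."""
    root = ET.fromstring(xml_data)
    names = [el.text for el in root.findall("./Blobs/Blob/Name")]
    marker = root.findtext("NextMarker") or None
    return names, marker


def local_path(name, prefix, dest):
    """Map a blob name to a local path that keeps the table folder itself."""
    parent = posixpath.dirname(prefix.rstrip("/"))
    rel = name[len(parent):].lstrip("/")
    return os.path.join(dest, *rel.split("/"))


def iter_blob_names(container_url, prefix):
    """Yield every blob name under prefix, following pagination markers."""
    marker = None
    while True:
        with urllib.request.urlopen(list_url(container_url, prefix, marker)) as resp:
            names, marker = parse_listing(resp.read())
        yield from names
        if not marker:
            break


def download_table(container_url, prefix, dest="."):
    """Download each object under prefix into dest (assumes anonymous read access)."""
    for name in iter_blob_names(container_url, prefix):
        if name.endswith("/"):
            continue  # skip directory-marker objects, if any exist
        target = local_path(name, prefix, dest)
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        urllib.request.urlretrieve(container_url + "/" + urllib.parse.quote(name), target)
```

Calling download_table(CONTAINER_URL, PREFIX) would then attempt roughly what the azcopy command above does, minus azcopy's parallelism and retries, so azcopy remains the better choice when it is available.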