kkdey / GSSG

Gene Set + S2G strategy annotations analyzed for disease architecture

Downloading data for SNP annotation #20

Closed: emdann closed this issue 1 year ago

emdann commented 1 year ago

Hello, apologies if this is a dumb question, but how does one download all the data mentioned in https://github.com/kkdey/GSSG#step-2-gene-setprogram-to-snp-annotation? Using recursive wget, I get a 404 error:

$ wget -r https://alkesgroup.broadinstitute.org/LDSCORE/Jagadeesh_Dey_sclinker/extras/
--2022-11-17 15:40:31--  https://alkesgroup.broadinstitute.org/LDSCORE/Jagadeesh_Dey_sclinker/extras/
Resolving alkesgroup.broadinstitute.org (alkesgroup.broadinstitute.org)... 34.120.167.96
Connecting to alkesgroup.broadinstitute.org (alkesgroup.broadinstitute.org)|34.120.167.96|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-11-17 15:40:31 ERROR 404: Not Found.

Many thanks for your help!

ktpolanski commented 1 year ago

Once I added storage.googleapis.com to my search terms, I was able to find gsutil. Once it is installed, just replace https://storage.googleapis.com/ with gs:// in the paths and the folders can be downloaded:

gsutil cp -r gs://broad-alkesgroup-public/LDSCORE/Dey_Enhancer_MasterReg/processed_data .
gsutil cp -r gs://broad-alkesgroup-public/LDSCORE/Jagadeesh_Dey_sclinker/extras .

It would be nice if the instructions were fleshed out a bit, and/or the files were moved to a more accessible location.
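The prefix replacement described above can be sketched as a small shell helper. This is only an illustration: `to_gs_uri` is a hypothetical name, not part of gsutil or the GSSG pipeline, and it assumes the URL starts with exactly `https://storage.googleapis.com/`.

```shell
# Hypothetical helper: rewrite a public Google Cloud Storage HTTPS URL
# into the gs:// form that gsutil expects.
to_gs_uri() {
  url="$1"
  prefix="https://storage.googleapis.com/"
  case "$url" in
    "$prefix"*)
      # Strip the HTTPS prefix and prepend gs://
      printf 'gs://%s\n' "${url#"$prefix"}"
      ;;
    *)
      # Anything else is not a bucket URL this helper understands
      printf 'not a storage.googleapis.com URL: %s\n' "$url" >&2
      return 1
      ;;
  esac
}
```

With the helper defined, a download would look like `gsutil cp -r "$(to_gs_uri https://storage.googleapis.com/broad-alkesgroup-public/LDSCORE/Jagadeesh_Dey_sclinker/extras)" .`, which expands to the second command above.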

kkdey commented 1 year ago

Thanks @ktpolanski for the answer; yes, that is the way to download all the files in a batch. I have added the command to the README. Apologies that the instructions are not fleshed out; we are recruiting a person to work on simplifying the pipeline, so hopefully things will look better in a few months. @emdann, feel free to let me know if you face any issues. Thanks!

emdann commented 1 year ago

Thanks for the quick reply, that's great to hear!

weifangliu commented 1 year ago

When I ran gsutil cp -r gs://broad-alkesgroup-public/LDSCORE/Jagadeesh_Dey_sclinker/extras $datadir, it gave me this error:

AccessDeniedException: 403 <my gmail address> does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).

Do I need permission to access the data?

kkdey commented 1 year ago

@weifangliu We are making some changes to our bucket setup to lower egress rates. We will be back up by the end of the week, though there may be some modifications to access paths. Please bear with us. Thanks for using sc-linker, and sorry for the inconvenience.

weifangliu commented 1 year ago

Thank you for the update!

kkdey commented 1 year ago

@weifangliu The webpage is back up now.