MIT-LCP / physionet

A collection of tools for working with the PhysioNet repository.
http://physionet.org/
MIT License

Cannot download using wget or gcp #145

Open mahela97 opened 1 year ago

mahela97 commented 1 year ago

When I tried to download the dataset using the wget command, it only created subdirectories containing index.html. When I tried using GCP, I got this error: "BadRequestException: 400 Bucket is a requester pays bucket but no user project provided."

tompollard commented 1 year ago

To download from Google Cloud, you'll need to specify a project ID for covering any download costs. See: https://stackoverflow.com/questions/47739741/bucket-is-requester-pays-bucket-but-no-user-project-provided

If you have a project ID then you can specify it in the download command (see: https://cloud.google.com/storage/docs/using-requester-pays#using). e.g. for gsutil:

gsutil -u PROJECT_IDENTIFIER cp gs://BUCKET_NAME/OBJECT_NAME OBJECT_DESTINATION

If you want to avoid download fees, you can download the data from the PhysioNet servers using the suggested wget command. This will be slower!

mahela97 commented 1 year ago

@tompollard Thank you for the reply. I am trying wget, but inside the subdirectories there is only an index.html file instead of images. Could you please help me fix that?

tompollard commented 1 year ago

@mahela97 I'm not clear which dataset you are trying to download, but essentially I think you'll need to be patient! wget loads the directory structure before the files, I believe, so you may not immediately see data within directories.
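One way to tell whether a long-running download is making progress is to count the files that are not directory listings. The sketch below assumes the download was started in the current directory and that wget wrote into a folder named after the host (`physionet.org`); `DOWNLOAD_DIR` and `count_data_files` are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Count downloaded files, ignoring the index.html listings that wget
# saves for each directory it visits.
count_data_files() {
    # -type f               : regular files only
    # ! -name 'index.html*' : skip directory listing pages
    find "$1" -type f ! -name 'index.html*' | wc -l | tr -d '[:space:]'
}

# Assumption: wget -r created a folder named after the host.
DOWNLOAD_DIR="${DOWNLOAD_DIR:-physionet.org}"

if [ -d "$DOWNLOAD_DIR" ]; then
    echo "data files so far: $(count_data_files "$DOWNLOAD_DIR")"
else
    echo "directory not found: $DOWNLOAD_DIR"
fi
```

Running this a few minutes apart should show the count growing if data files are arriving. A count that stays at zero for a long time may mean the transfer stalled or was interrupted.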

mahela97 commented 1 year ago

@tompollard I am trying with mimic-cxr. Thank you for the help. I will check after a few hours.

tompollard commented 1 year ago

@mahela97 makes sense, it's a large dataset! I think this is the same issue described at:

I'd be interested in hearing how long the download takes to complete with wget.

edeiana23 commented 1 year ago

Was anybody able to download the dataset with the wget command? I left my computer on the whole weekend but nothing happened. I still only see the folders and the index.html file, but no images in DICOM format. Asking @mahela97 and @ayhyap specifically, since they raised the issue previously.

ayhyap commented 1 year ago

Yes, the files eventually transferred; it just took a while. It is possible that an error interrupted your wget command. The command provided on the mimic-cxr page includes the relevant options to resume the transfer if it stopped prematurely.
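As a rough illustration of resuming an interrupted transfer, the sketch below re-runs a resumable wget command until it succeeds or a retry limit is reached. It is not the exact command from the dataset page: `PN_URL` and `PN_USER` are placeholders, and `build_cmd`, `retry`, `MAX_ATTEMPTS`, `RETRY_DELAY`, and `RUN` are hypothetical names. The wget flags themselves (`-r`, `-N`, `-c`, `-np`, `--user`, `--ask-password`) are standard options.

```shell
#!/bin/sh
# Placeholders (assumptions): copy the real URL and username from the
# dataset page before running.
PN_URL="${PN_URL:-https://physionet.org/files/DATASET/VERSION/}"
PN_USER="${PN_USER:-your_physionet_username}"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-5}"

# Print the wget command so it can be inspected before running.
#   -r   recurse into subdirectories
#   -N   timestamping: skip files whose local copy is already up to date
#   -c   continue partially downloaded files
#   -np  never ascend to the parent directory
build_cmd() {
    echo "wget -r -N -c -np --user $PN_USER --ask-password $PN_URL"
}

# Run a command up to MAX_ATTEMPTS times, pausing between failures.
retry() {
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
            return 1
        fi
        echo "attempt $attempt failed; retrying" >&2
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-30}"
    done
}

# Dry run by default: only show the command. Set RUN=1 to execute it.
if [ "${RUN:-0}" = "1" ]; then
    retry wget -r -N -c -np --user "$PN_USER" --ask-password "$PN_URL"
else
    build_cmd
fi
```

Because `-c` and `-N` let wget skip or continue files it already has, each retry should pick up roughly where the previous attempt stopped. Note that `--ask-password` prompts again on every attempt, so the loop is not fully unattended.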