razorx89 / roco-dataset

Radiology Objects in COntext (ROCO): A Multimodal Image Dataset

Download fails after 11 tries #12

Closed koustav123 closed 1 year ago

koustav123 commented 1 year ago

I'm having the same issue as #11, even on Ubuntu. Is the link active? I tried n 1 as suggested, but it is still not working.

saviola777 commented 1 year ago

It works fine for me. Unless you see the error "Giving up download of archive […]", there is no problem. The script retries each download until it either succeeds or exceeds the maximum number of retries. It prints no explicit message on success, which is why the output can be misleading.
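The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function name `download_with_retries`, the `fetch` callable, and the default of 10 retries are all hypothetical. Unlike the script, the sketch prints a line on success so that the outcome is unambiguous.

```python
import time
from typing import Callable


def download_with_retries(fetch: Callable[[], None], name: str,
                          max_retries: int = 10, delay: float = 0.0) -> bool:
    """Call fetch() until it succeeds or the retry budget is exhausted.

    All names and defaults here are assumptions made for illustration.
    """
    # One initial attempt plus up to max_retries further attempts.
    for attempt in range(1, max_retries + 2):
        try:
            fetch()
            print(f"Downloaded {name} on attempt {attempt}")
            return True
        except OSError as exc:
            # Failed attempts are reported, but they are not fatal yet.
            print(f"Attempt {attempt} for {name} failed: {exc}")
            time.sleep(delay)
    # Only this final message indicates a real failure.
    print(f"Giving up download of archive {name}")
    return False
```

With a loop like this, several "failed" lines followed by a success line are harmless; only the final "Giving up" message means the archive is actually missing.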

koustav123 commented 1 year ago

Hey, thanks for the quick response.

I checked this single source file: https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?tool=roco-fetch&email=johannes.rueckert@fh-dortmund.de&id=PMC3221140

When I run wget ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/b2/6b/PMC3221140.tar.gz ./ from the terminal, it fails. Does this wget work for you?

saviola777 commented 1 year ago

Yes, it works.

wget ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/b2/6b/PMC3221140.tar.gz
--2023-05-30 13:29:52-- ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/b2/6b/PMC3221140.tar.gz
=> ‘PMC3221140.tar.gz’
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 2607:f220:41f:250::230, 2607:f220:41f:250::228, 130.14.250.11, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41f:250::230|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/pmc/oa_package/b2/6b ... done.
==> SIZE PMC3221140.tar.gz ... 582582
==> EPSV ... done. ==> RETR PMC3221140.tar.gz ... done.
Length: 582582 (569K) (unauthoritative)
PMC3221140.tar.gz 100%[===============================================>] 568,93K 1,05MB/s in 0,5s
2023-05-30 13:29:55 (1,05 MB/s) - ‘PMC3221140.tar.gz’ saved [582582]

That said, I know the FTP server can be really brittle: downloads often fail or stall for no apparent reason. 😦 It could also be a networking issue on your side. But again, unless you see the message "Giving up download of archive", everything should still be fine, even if some of the archives take a couple of attempts to download.

koustav123 commented 1 year ago

Hey, the problem was that my organization's network blocks FTP. The code does indeed work fine. Thank you very much again. I will close this.
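For anyone who suspects the same cause, a quick check along these lines may help. It is only a sketch: `ftp_reachable` and `ftp_to_https` are hypothetical helper names, and the HTTPS rewrite assumes that the server exposes the same paths over HTTPS, which should be verified before relying on it.

```python
import socket
from urllib.parse import urlsplit, urlunsplit


def ftp_reachable(host: str, port: int = 21, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the FTP control port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def ftp_to_https(url: str) -> str:
    """Rewrite an ftp:// URL to https:// on the same host and path.

    Assumption: the host mirrors its FTP tree over HTTPS.
    """
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        return url
    return urlunsplit(("https", parts.netloc, parts.path,
                       parts.query, parts.fragment))
```

If `ftp_reachable("ftp.ncbi.nlm.nih.gov")` returns False on a network where ordinary web browsing works, outbound FTP is probably being blocked. Note that a successful connection on port 21 does not guarantee that the data transfer itself will get through.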