adamjstewart opened this issue 3 years ago
Current workaround pointed out by @calebrob6:
$ wget "ftp://m1474000:m1474000@dataserv.ub.tum.de/ROIs1158_spring_lc.tar.gz"
Sorry for the late reply! I would prefer rsync: "The data server also offers downloads with rsync (password m1474000): rsync rsync://m1474000@dataserv.ub.tum.de/m1474000/"
Hi @schmitt-muc, when I run that command it doesn't download anything.
I'm trying to write a PyTorch data loader. Torchvision is able to automatically download and checksum datasets from a URL, but the FTP and rsync URLs don't work for this.
I have just checked (running Ubuntu 20.04 LTS from inside Windows 10 Enterprise using WSL2):
Running the command
rsync -chavzP --stats rsync://m1474000@dataserv.ub.tum.de/m1474000/ path/to/your/local/storage/folder
works. Of course you first have to enter the password m1474000, and of course retrieving the incremental file list takes ages, but it should do the job.
Yes, that seems to work, although I still can't download the data from Python without calling some system rsync executable. A normal URL would be much nicer for cases where users aren't using rsync.
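Until a plain URL exists, one workaround from Python is to shell out to the system rsync binary. A minimal sketch, assuming rsync is on the PATH and using the rsync URL and flags quoted above (the `RSYNC_PASSWORD` environment variable is a standard rsync feature that avoids the interactive prompt):

```python
import os
import subprocess

# rsync endpoint quoted earlier in this thread
RSYNC_URL = "rsync://m1474000@dataserv.ub.tum.de/m1474000/"

def rsync_cmd(dest, url=RSYNC_URL):
    """Build the rsync argv shown above (-chavzP --stats)."""
    return ["rsync", "-chavzP", "--stats", url, dest]

def rsync_download(dest, url=RSYNC_URL):
    """Invoke the system rsync executable (must be installed)."""
    # RSYNC_PASSWORD lets rsync skip the interactive password prompt
    env = dict(os.environ, RSYNC_PASSWORD="m1474000")
    subprocess.run(rsync_cmd(dest, url), check=True, env=env)
```

This still depends on an external rsync executable, so it only sidesteps the prompt, not the underlying problem of needing a normal URL.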
Ah, now I understand. I suggest following Caleb Robinson's advice. At least for me, wget -r "ftp://m1474000:m1474000@dataserv.ub.tum.de" does the job just fine and downloads the whole package automatically.
Yes, that URL works with wget but not with Python's urllib for some reason. Is there a working https:// option?
I have sent an inquiry to TUM's library, which hosts the data on their media server. The response won't make you too happy: there is definitely no https:// option, as the .zip file you get by clicking the Download button in the graphical interface is only created on the fly by an internal Nextcloud function. The only suggestion I got was to look into the Python libraries ftplib, wget, and urllib2, which support FTP downloads.
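Following the librarian's pointer, here is a minimal sketch using the standard-library ftplib with the credentials from this thread. It is untested against the server, so treat the host/login details and the approach as assumptions; the MD5 digest it returns could be compared against a known checksum, torchvision-style:

```python
import ftplib
import hashlib

# Credentials quoted earlier in this thread
HOST = "dataserv.ub.tum.de"
USER = PASSWORD = "m1474000"

def md5_hex(chunks):
    """MD5 hex digest of an iterable of byte chunks."""
    h = hashlib.md5()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()

def ftp_download(filename, dest):
    """Fetch one file over FTP, writing to dest and returning its MD5."""
    h = hashlib.md5()
    with ftplib.FTP(HOST) as ftp, open(dest, "wb") as f:
        ftp.login(USER, PASSWORD)

        def handle(chunk):
            f.write(chunk)
            h.update(chunk)

        ftp.retrbinary(f"RETR {filename}", handle)
    return h.hexdigest()
```

For example, ftp_download("ROIs1158_spring_lc.tar.gz", "ROIs1158_spring_lc.tar.gz") would retrieve the file mentioned at the top of this thread while checksumming it in one pass.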
There also seems to be a mirrored version on Google Cloud Storage (see https://gitlab.com/frontierdevelopmentlab/disaster-prevention/sen12ms): gsutil -m rsync -r gs://fdl_floods_2019_data/SEN12MS. Not sure whether this is of any help for you, though.
Hi, I'm working on a torchvision-style dataset that automatically downloads and checksums SEN12MS. I see that the dataset is hosted on https://dataserv.ub.tum.de/s/m1474000. However, when I try to download one of the files, I get an error message:
Clicking on the download button allows me to download through the web browser, but I would like to be able to download from the command line. Is this possible (without disabling security certificate checks)?