open-AIMS / ozfish

Public dataset of Australian fish species for advancing machine learning research

Issue downloading data #1

Open jonochang opened 4 years ago

jonochang commented 4 years ago

Hi!

I'm trying to download the frames from https://data.pawsey.org.au/public/?path=/FDFML/frames as a zip and it appears to die part way through. Is there a mirror of the dataset elsewhere, or is there a mechanism to resume downloads?

threehundred commented 4 years ago

Hi @jonochang - try contacting help@pawsey.org.au. If this is unsuccessful please let me know and I'll see if I can sort an alternate download location.

jonochang commented 4 years ago

Thanks for the prompt response. I've sent them an email to ask whether they support downloading large datasets.

hoalarious commented 4 years ago

@jonochang We don't seem to get good download speeds on the dataset either. I found that not all images may be needed during early testing. I wrote a few scripts to extract frames with multiple annotations from the metadata, download those selectively and generate a JSON file to be used in training. Perhaps you'll find a use for it too.

https://github.com/hoalarious/AIMS_dataset_conversion
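For readers who only want the idea, the selection step described above might look roughly like the sketch below. It is not taken from the linked repository: the record layout, the `image_name` key, and the function names `select_frames` and `write_manifest` are all assumptions made for illustration, and the real metadata files may be structured differently.

```python
import json
from collections import Counter


def select_frames(annotations, min_annotations=2, frame_key="image_name"):
    """Return sorted frame names that carry at least `min_annotations` rows.

    `annotations` is assumed to be a list of dicts, one per annotation,
    with the frame's file name stored under `frame_key` (hypothetical key).
    """
    counts = Counter(row[frame_key] for row in annotations)
    return sorted(name for name, n in counts.items() if n >= min_annotations)


def write_manifest(annotations, path, min_annotations=2, frame_key="image_name"):
    """Write a JSON manifest mapping each selected frame to its annotation rows."""
    keep = set(select_frames(annotations, min_annotations, frame_key))
    manifest = {}
    for row in annotations:
        if row[frame_key] in keep:
            manifest.setdefault(row[frame_key], []).append(row)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(manifest, fh, indent=2, sort_keys=True)
    return manifest
```

The keys of the resulting manifest are the only frames that need to be downloaded, which keeps an early experiment well below the size of the full archive.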

gboeer commented 3 years ago

Hi, I tried to download your new annotations, but the data provider (Pawsey) shows all files as offline. Maybe this is just a temporary issue, but I wanted to let you know so you can check.

Edit: I noticed this isn't true for all files, only for these newer ones:
https://data.pawsey.org.au/public/?path=/FDFML/labelled/measurementfiles
https://data.pawsey.org.au/public/?path=/FDFML/labelled/speciesboxes

vedrusss commented 3 years ago

Thanks, @hoalarious, your download script is a good solution. I've improved it a bit: I added multithreaded downloading and safe handling of download failures. Running it with the parameter --threads=500 makes the download really fast. The modified script is attached: download_dataset_from_json.py.md.
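The general pattern of threaded downloading with failure handling can be sketched as follows. This is not the attached script, only a minimal illustration using the standard library. The names `fetch_url`, `download_one`, and `download_all`, the `.part` temporary suffix, and the retry count are assumptions. The default of 16 threads is also arbitrary; the comment above reports good results with `--threads=500`.

```python
import os
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_url(url, timeout=60):
    """Default fetcher: return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def download_one(url, dest, fetch=fetch_url, retries=3):
    """Download one file; return True on success, False once all retries fail."""
    if os.path.exists(dest):
        return True  # finished in an earlier run, so a restart skips it
    tmp = dest + ".part"
    for _ in range(retries):
        try:
            data = fetch(url)
            with open(tmp, "wb") as fh:
                fh.write(data)
            os.replace(tmp, dest)  # rename only after the file is complete
            return True
        except Exception:
            continue  # network or disk error: try again
    return False


def download_all(jobs, threads=16, fetch=fetch_url, retries=3):
    """Download (url, dest) pairs in parallel; return the URLs that failed."""
    failed = []
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = {
            pool.submit(download_one, url, dest, fetch, retries): url
            for url, dest in jobs
        }
        for fut in as_completed(futures):
            if not fut.result():
                failed.append(futures[fut])
    return failed
```

Because completed files are skipped and partial files never appear under their final name, rerunning `download_all` with the same job list after a crash only fetches what is still missing.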