cvdfoundation / open-images-dataset

Open Images is a dataset of ~9 million images that have been annotated with image-level labels and bounding boxes spanning thousands of classes.
https://github.com/openimages/dataset

Speed so slow #28

Open lucasjinreal opened 5 years ago

lucasjinreal commented 5 years ago

Completed 256.0 KiB/45.9 GiB (3.2 KiB/s) with 1 file(s) remaining

At the current speed it would take about 100 years to finish.
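As a rough check on that estimate, here is a minimal sketch that converts a file size and a transfer rate into days. The helper name `eta_days` is hypothetical, and the numbers are the ones quoted in this thread (45.9 GiB at 3.2 KiB/s). For that one file the figure comes out to months rather than years, which is still impractical.

```shell
# Hypothetical helper: estimate transfer time in days.
# $1 = size in GiB, $2 = rate in KiB/s
eta_days() {
  awk -v gib="$1" -v kibs="$2" \
    'BEGIN { printf "%.1f\n", gib * 1024 * 1024 / kibs / 86400 }'
}

eta_days 45.9 3.2   # prints 174.1
```

The same helper can be used to compare against any other measured rate before committing to a long download.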

Nikolai10 commented 3 years ago

@jrruijli any hints? Downloading via the AWS CLI is indeed very slow. Thank you very much.

ltrottier commented 3 years ago

I was able to get a reasonable speed (~1.1 MiB/s) by replacing sync with cp:

aws s3 --no-sign-request cp --recursive s3://open-images-dataset/train train
aws s3 --no-sign-request cp --recursive s3://open-images-dataset/validation validation
aws s3 --no-sign-request cp --recursive s3://open-images-dataset/test test

Explanations here.

You will need a steady internet connection, because if it drops and you rerun cp, the copy will start over from the beginning.

Also, keep in mind that aws s3 can quickly reach your host's limits if you use the default throttling settings (it happened to me on several occasions). I suggest that you take a look at this to configure it.
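For the throttling settings mentioned above, one possible approach is sketched below. It relies on the AWS CLI's `default.s3.max_concurrent_requests` and `default.s3.max_bandwidth` settings; the wrapper name `set_s3_limits` and the example values are assumptions for illustration, not the configuration used in this thread.

```shell
# Hypothetical wrapper around `aws configure set` for the default profile.
# $1 = maximum concurrent S3 requests (for example 4)
# $2 = bandwidth cap for S3 transfers (for example 1MB/s)
set_s3_limits() {
  aws configure set default.s3.max_concurrent_requests "$1"
  aws configure set default.s3.max_bandwidth "$2"
}
```

Calling `set_s3_limits 4 1MB/s` before starting the cp commands would apply those limits to later aws s3 transfers. Suitable values depend on your host and connection, so treat these as placeholders.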

Best of luck!

Nikolai10 commented 3 years ago

@ltrottier thank you very much, that was helpful!