fwang91 / IMDb-Face

A new large-scale noise-controlled face recognition dataset.

Downloading is too slow. #7

Open LCorleone opened 6 years ago

LCorleone commented 6 years ago

Great job! I am downloading with Python's urllib. Perhaps because I am in China, downloading from the URLs is too slow. Is there any way to deal with this? Or could anyone share the dataset? Thanks.

hwfan commented 6 years ago

I think you can use proxy servers to accelerate your access to these images on Amazon.

braveapple commented 6 years ago

@LCorleone You can use multi-threading or proxy servers in Python to speed up downloading the images. The code is available.
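
A minimal sketch of the multi-threading and proxy suggestion, using only the standard library. It is not the code referred to above; the function names (`fetch_url`, `download_all`), the `(filename, url)` job format, and the proxy address are assumptions made for illustration.

```python
import os
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_url(url, proxy=None, timeout=10):
    """Return the raw bytes at `url`, optionally through an HTTP(S) proxy."""
    handlers = []
    if proxy:
        # `proxy` is a placeholder such as "http://127.0.0.1:8080" (hypothetical).
        handlers.append(urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    opener = urllib.request.build_opener(*handlers)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()


def download_all(jobs, out_dir, fetch=fetch_url, workers=16):
    """Download (filename, url) pairs in parallel; return the URLs that failed."""
    os.makedirs(out_dir, exist_ok=True)

    def one(job):
        name, url = job
        data = fetch(url)  # raises on expired links, timeouts, etc.
        with open(os.path.join(out_dir, name), "wb") as f:
            f.write(data)

    failed = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(one, job): job for job in jobs}
        for fut in as_completed(futures):
            try:
                fut.result()
            except Exception:
                # Record the URL so it can be retried or counted as expired.
                failed.append(futures[fut][1])
    return failed
```

To route traffic through a proxy, one option is `download_all(jobs, "images", fetch=functools.partial(fetch_url, proxy="http://127.0.0.1:8080"))`. Threads suit this workload because each download spends most of its time waiting on the network, and the returned list of failed URLs makes it easy to retry or to measure how many links have expired.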

LCorleone commented 6 years ago

@BraveApple Thanks, nice work!

smartwell commented 6 years ago

I don't know why I must use proxy servers to download the pictures. It is maddening when the requests keep breaking.

smartwell commented 6 years ago

I give up.

wangx404 commented 6 years ago

Actually, the best way to download such datasets is to use a cloud server. I have used AWS for this. However, one problem remains: transferring the dataset from the cloud server to our own computers in China is very slow. Even with bypy, it still sucks!

wangx404 commented 6 years ago

I wrote a script yesterday to download the dataset on AWS. After 12 hours, 600k images have been downloaded. (About 20% of the image links no longer exist.) Even though I have cropped the faces from the raw images, the dataset is still huge. I think it will be about 55 GB once the whole dataset is downloaded.

wangx404 commented 6 years ago

Finally, I finished. It's about 50 GB, with about 17% of the links expired.

superzrx commented 5 years ago

@wangx404 Could you share the cropped data?

danielkaifeng commented 5 years ago

Could someone kindly upload the downloaded data to BaiduYun?

Dantju commented 5 years ago

Why does IMDb-Face.csv only have 1048576 images? Have you downloaded the whole dataset?
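
Note that 1048576 is exactly 2^20, which is also the maximum number of rows in an Excel worksheet, so a count of exactly this size may come from opening the file in a spreadsheet program rather than from the file itself. A small sketch for counting the rows directly, assuming a local copy of the CSV with one header line (the helper name `count_rows` is hypothetical):

```python
import csv


def count_rows(path, has_header=True):
    """Count data rows in a CSV file without loading it into memory."""
    with open(path, newline="", encoding="utf-8") as f:
        n = sum(1 for _ in csv.reader(f))
    # Subtract the header line, if the file has one and is not empty.
    return n - 1 if has_header and n else n
```

Calling `count_rows("IMDb-Face.csv")` on a local copy would show whether the file really holds more rows than a spreadsheet can display.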

xianyujie commented 5 years ago

@wangx404 Could you share your downloaded data on BaiduYun? Many thanks.