virajprabhu closed this issue 3 years ago
Hey, indeed many of the YFCC image URLs are broken, so downloading from them isn't the best route, unfortunately. I haven't tried the PyPI package myself, but since it hosts the images separately on AWS, I believe it should be functional. Once I have some free time in the coming weeks, I can try writing a script that downloads only the relevant files from the corresponding shards to save time and space.
I see, okay. A script to download the relevant files would be very helpful! I'll try the PyPI package in the meantime. Thanks!
Thanks for releasing this dataset! I was able to download the YFCC images from the URLs provided in the metadata file for ~730k of the 1.1 million images. The URLs for the rest unfortunately appear to be broken.
Do you have any recommendations for a source that has all the images? I was thinking of trying https://pypi.org/project/yfcc100m/, which you recommend in your README. Have you had any luck using it to download all the images?
Thanks!
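For anyone reproducing the URL-based download described above, here is a rough sketch of a loop that records which image URLs fail so they can be retried from another source later. It assumes the metadata has already been parsed into `(image_id, url)` pairs; the function names, the `.jpg` extension, and the output layout are hypothetical and not taken from the dataset's own tooling.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL and return its body; raises on HTTP or network errors."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def download_images(
    items: Iterable[Tuple[str, str]],
    out_dir: str,
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> Tuple[List[str], Dict[str, str]]:
    """Download each (image_id, url) pair into out_dir.

    Returns the IDs that are present on disk afterwards and a dict
    mapping each failed ID to the error that was raised.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    succeeded: List[str] = []
    failed: Dict[str, str] = {}
    for image_id, url in items:
        # Assumption: images are saved as JPEG files named by their ID.
        target = out / f"{image_id}.jpg"
        if target.exists():
            # Already downloaded in a previous run; skip the request.
            succeeded.append(image_id)
            continue
        try:
            data = fetch(url)
        except (OSError, ValueError) as exc:
            # URLError/HTTPError are OSError subclasses; ValueError covers
            # malformed URLs. Record the failure and keep going.
            failed[image_id] = repr(exc)
            continue
        target.write_bytes(data)
        succeeded.append(image_id)
    return succeeded, failed
```

The `failed` dict can be written to disk and used as the list of IDs to fetch from a different source, for example the PyPI package mentioned above.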