Closed MohammedSB closed 9 months ago
Downloading from the links is, by definition, how you get CC3M.
Yeah, I get that CC3M is meant to be a non-curated, web-scraped dataset, but sadly, downloading the original data from the links no longer works because many of the images are no longer available.
I am especially interested in getting all (or most, within ±100k) of the data because I want to compare training methods against results from the literature.
Hello,
A slightly unrelated question, but can someone please point me to where I can download the entire CC3M dataset? Is it hosted publicly somewhere on AWS or another cloud provider?
A lot of the URLs in the Google-provided database no longer work, so I was only able to download fewer than 3M of the original 3.3M images.
Would appreciate your help!
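For anyone who wants to measure how many of the links still resolve, here is a minimal sketch, not a definitive downloader. It assumes the URL list is tab-separated text with one `caption<TAB>url` pair per line and no header row; the function names `read_pairs`, `fetch`, and `count_available` are hypothetical and made up for illustration. It only uses the Python standard library, and any exception raised while fetching is counted as a dead link.

```python
import csv
import io
import urllib.request
from typing import Callable, Iterable, Iterator, List, Tuple


def read_pairs(tsv_text: str) -> Iterator[Tuple[str, str]]:
    """Yield (caption, url) pairs from tab-separated text (assumed: no header)."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip malformed lines instead of failing the whole run.
        if len(row) == 2:
            yield row[0], row[1]


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download one URL; raises on HTTP errors, timeouts, or bad URLs."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def count_available(
    pairs: Iterable[Tuple[str, str]],
    fetcher: Callable[[str], bytes] = fetch,
) -> Tuple[int, List[str]]:
    """Return (number of successful downloads, list of URLs that failed)."""
    ok = 0
    failed: List[str] = []
    for _caption, url in pairs:
        try:
            fetcher(url)
            ok += 1
        except Exception:
            # Any failure (404, timeout, DNS error, ...) is treated as a dead link.
            failed.append(url)
    return ok, failed
```

Passing a different `fetcher` makes it possible to try the counting logic without touching the network, and the returned list of failed URLs can be saved so that a later retry only revisits those.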