Open christophschuhmann opened 2 years ago
Hi! What do we need to do with this dataset? Create a loader or something else? Thanks!
We would need the dataset in this format, as WebDataset tar files: https://github.com/LAION-AI/dataset-spec
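For anyone unsure what "webdataset tar files" means in practice, here is a minimal sketch using only the Python standard library. It relies on the general WebDataset convention that files sharing a basename inside the tar form one sample. The `write_shard` and `list_shard` helpers, the `.jpg`/`.json` pairing, the metadata fields, and the shard name are all assumptions for illustration; the required field names and layout should be taken from the dataset-spec repository linked above, not from this example.

```python
import io
import json
import os
import tarfile
import tempfile


def write_shard(path, samples):
    """Write (key, image_bytes, metadata) triples into one tar shard.

    Files that share the same basename (the key) are grouped into one
    sample by WebDataset-style readers, so keys must not contain dots.
    """
    with tarfile.open(path, "w") as tar:
        for key, image_bytes, metadata in samples:
            members = {
                f"{key}.jpg": image_bytes,
                f"{key}.json": json.dumps(metadata).encode("utf-8"),
            }
            # Keep the files of one sample adjacent in the archive.
            for name, payload in members.items():
                info = tarfile.TarInfo(name=name)
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))


def list_shard(path):
    """Return the member names of a shard, in archive order."""
    with tarfile.open(path, "r") as tar:
        return tar.getnames()


if __name__ == "__main__":
    # Placeholder bytes stand in for real JPEG data; the metadata
    # fields below are hypothetical, not taken from the spec.
    samples = [
        ("000000000", b"placeholder-image-0", {"question": "q0", "answer": "a0"}),
        ("000000001", b"placeholder-image-1", {"question": "q1", "answer": "a1"}),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        shard = os.path.join(tmp, "00000.tar")
        write_shard(shard, samples)
        print(list_shard(shard))
```

A real conversion would read the actual image files and annotations, and would probably split the output into several shards of bounded size instead of one tar.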
Do I need to upload this dataset somewhere, or what should I do with it afterwards?
I will PM you on Discord with access details for a server where you can upload it. I will copy it to our S3 buckets later. :)
If the data is public, I think it would be good to put the processed version in a public place as well, and not only in the private S3 buckets. Hugging Face Datasets, for example, could be such a public place.
Is the data public and redistributable, @marianna13?
Yes, it's licensed under the MIT license.
https://paperswithcode.com/dataset/visual7w