Open rom1504 opened 3 years ago
Oh, I see that this repo, https://github.com/mlfoundations/open_clip#yfcc-and-other-datasets, has support for it; it might be another example.
I think this would be a helpful addition to the repo, however, my main short-term focus is a collaboration with the team behind that repo.
If you or anyone else reading is interested in seeing this addition to the repo, I'd be glad to accept a PR!
I think it could be pretty useful to add a webdataset loader to this, so that datasets in webdataset format can be used here. This is relevant because large webdataset datasets are starting to become available (one is crawling at home, of size 400M).
I think https://github.com/lucidrains/DALLE-pytorch/pull/280/files may be a good example of how to do it.
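
For anyone picking this up, here is a rough sketch of what such a loader has to do, assuming the usual webdataset layout: each shard is a plain tar file, and consecutive files that share a base name (e.g. `0000.jpg` and `0000.txt`) form one sample. It uses only the Python standard library, so it is not the `webdataset` package's API and not what the linked PR does; the names `split_key`, `iter_samples`, and `write_demo_shard` are made up for illustration. A real loader would more likely use the `webdataset` package itself, which also handles shard lists, shuffling, and decoding.

```python
import io
import os
import tarfile
import tempfile
from typing import Any, Dict, Iterator, List, Tuple


def split_key(name: str) -> Tuple[str, str]:
    """Split 'dir/0001.jpg' into ('dir/0001', 'jpg') at the first dot of the file name."""
    directory, _, filename = name.rpartition("/")
    base, _, ext = filename.partition(".")
    key = f"{directory}/{base}" if directory else base
    return key, ext


def iter_samples(tar_path: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per sample, grouping consecutive tar members by key.

    Each dict maps an extension (e.g. 'jpg', 'txt') to raw bytes,
    plus '__key__' holding the shared base name.
    """
    current = None
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, ext = split_key(member.name)
            # A new key means the previous sample is complete.
            if current is None or current["__key__"] != key:
                if current is not None:
                    yield current
                current = {"__key__": key}
            current[ext] = tar.extractfile(member).read()
    if current is not None:
        yield current


def write_demo_shard(tar_path: str, pairs: List[Tuple[str, bytes, str]]) -> None:
    """Hypothetical helper: write (key, image_bytes, caption) pairs as a tiny shard."""
    with tarfile.open(tar_path, "w") as tar:
        for key, image_bytes, caption in pairs:
            for ext, payload in (("jpg", image_bytes), ("txt", caption.encode("utf-8"))):
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shard = os.path.join(tmp, "shard-000000.tar")
        # The image bytes here are placeholders, not real JPEG data.
        write_demo_shard(shard, [("0000", b"fake-image-0", "a cat"),
                                 ("0001", b"fake-image-1", "a dog")])
        for sample in iter_samples(shard):
            print(sample["__key__"], len(sample["jpg"]), sample["txt"].decode("utf-8"))
```

In a training script, the generator would typically be wrapped in an iterable-style dataset, with the image bytes decoded and the caption tokenized per sample; those parts depend on this repo's existing pipeline, so they are left out here.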