allenai / mmc4

MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
MIT License
904 stars 34 forks

Dataset available on Huggingface? #20

Open snat-s opened 10 months ago

snat-s commented 10 months ago

Hi guys! Love the dataset. I want to use it for some training I am going to do, and I would like to load it with Hugging Face datasets. I can upload it there and make it public, as long as you are cool with it. Let me know if you have any issues with that.