allenai / mmc4

MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
MIT License
901 stars 34 forks

Add the option of downloading images from the provided links. #9

Closed: sramshetty closed this 1 year ago

sramshetty commented 1 year ago

Hopefully, this will make the dataset's images more accessible to users. Also, since I didn't have ImageMagick and didn't want to install it, I opted to resize with Pillow when needed.
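
For readers unfamiliar with the approach, here is a minimal sketch of downloading an image from a link and resizing it with Pillow only when it is too large. It is not the code from this pull request: the function names `fetch_image_bytes` and `resize_if_needed`, the `max_side` limit of 800 pixels, and the choice to re-encode as JPEG are all assumptions made for illustration.

```python
import io
import urllib.request

from PIL import Image


def fetch_image_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw bytes behind an image URL (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def resize_if_needed(data: bytes, max_side: int = 800) -> bytes:
    """Shrink an image so its longer side is at most max_side pixels.

    Images that already fit are returned unchanged. The 800-pixel
    default is an assumed value, not one taken from the repository.
    """
    image = Image.open(io.BytesIO(data))
    if max(image.size) <= max_side:
        return data

    # Convert to RGB so the result can always be saved as JPEG.
    image = image.convert("RGB")
    # thumbnail() resizes in place and preserves the aspect ratio.
    image.thumbnail((max_side, max_side))

    out = io.BytesIO()
    image.save(out, format="JPEG")
    return out.getvalue()
```

A typical call would be `resize_if_needed(fetch_image_bytes(url))`, with the result written to disk. Using `Image.thumbnail` keeps the dependency footprint to Pillow alone, which matches the motivation of avoiding an ImageMagick install.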