huggingface/OBELICS
Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M documents, 115B text tokens and 353M images.
https://huggingface.co/datasets/HuggingFaceM4/OBELICS
Apache License 2.0
171 stars
9 forks
LDA #11
Closed
jrryzh closed 3 weeks ago