kakaobrain / coyo-dataset

COYO-700M: Large-scale Image-Text Pair Dataset
https://kakaobrain.com/contents?contentId=7eca73e3-3089-43cb-b701-332e8a1743fd

Average caption length for COYO-700M #15

Closed BIGBALLON closed 7 months ago

BIGBALLON commented 7 months ago

Is there any dataset analysis for COYO-700M, especially the average caption length?

mwbyeon commented 7 months ago

See the links below.
https://github.com/kakaobrain/coyo-dataset?tab=readme-ov-file#statistics
https://lookerstudio.google.com/reporting/d0929bbe-e617-4a2d-8e84-ec349a97e3d0/page/p_654gl4u4xc?s=jvwkG5XCzYI
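
For anyone who wants to check the average caption length on their own subset, here is a minimal sketch. It is not taken from the COYO repository: the helper name `caption_length_stats` and the sample captions are made up for illustration, and it assumes the captions are already loaded as plain strings.

```python
from statistics import mean
from typing import Iterable, Optional


def caption_length_stats(captions: Iterable[Optional[str]]) -> dict:
    """Compute mean caption length in characters and in whitespace-separated words.

    Hypothetical helper for illustration; not part of the COYO tooling.
    """
    char_lengths = []
    word_lengths = []
    for caption in captions:
        if caption is None:  # skip missing captions rather than counting them as empty
            continue
        char_lengths.append(len(caption))
        word_lengths.append(len(caption.split()))
    if not char_lengths:
        raise ValueError("no captions provided")
    return {
        "count": len(char_lengths),
        "mean_chars": mean(char_lengths),
        "mean_words": mean(word_lengths),
    }


# Made-up captions, not real COYO-700M rows.
sample = ["a photo of a cat", "sunset over the sea", None, "dog"]
stats = caption_length_stats(sample)
print(stats)
```

To run this on the real data, the captions would need to be read from wherever you store them (the README describes a `text` field for the caption). Word counts here use simple whitespace splitting, so they will differ from tokenizer-based counts.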

BIGBALLON commented 7 months ago

@mwbyeon thanks!!!