mlfoundations / open_clip

An open source implementation of CLIP.

custom dataset #786

Closed mayilin0714 closed 2 months ago

mayilin0714 commented 6 months ago

Hello, I'd like to convert local data into the webdataset format for model training. My data consists of image files in jpg format and their corresponding image description txt files. Would it be sufficient to pack these files into tar archives for this purpose?
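
For reference, the packing step described above can be sketched with Python's standard `tarfile` module. This is a minimal sketch under stated assumptions, not the open_clip authors' method: it assumes each image `NAME.jpg` has a caption file `NAME.txt` in the same directory, and the function name `pack_shards`, the `shard-000000.tar` naming, and the `samples_per_shard` parameter are hypothetical choices made for illustration. WebDataset groups tar members into one sample by shared basename, so each pair is written next to each other under the same name.

```python
import tarfile
from pathlib import Path


def pack_shards(src_dir, out_dir, samples_per_shard=1000):
    """Pack matching NAME.jpg / NAME.txt pairs into numbered tar shards.

    Hypothetical helper: the name, shard naming scheme, and default
    shard size are illustrative assumptions.
    """
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Keep only images that have a caption file with the same basename.
    images = sorted(
        p for p in src.glob("*.jpg") if p.with_suffix(".txt").exists()
    )

    shard_paths = []
    for start in range(0, len(images), samples_per_shard):
        shard_path = out / f"shard-{start // samples_per_shard:06d}.tar"
        with tarfile.open(shard_path, "w") as tar:
            for img in images[start:start + samples_per_shard]:
                txt = img.with_suffix(".txt")
                # Store both files of a sample adjacently, with flat names
                # that differ only in their extension.
                tar.add(img, arcname=img.name)
                tar.add(txt, arcname=txt.name)
        shard_paths.append(shard_path)
    return shard_paths
```

Calling `pack_shards("my_data", "shards", samples_per_shard=1000)` would write `shards/shard-000000.tar`, `shards/shard-000001.tar`, and so on. Two caveats worth checking against your own data: WebDataset splits a member name at the first dot, so basenames that contain extra dots (for example `cat.01.jpg`) may not pair up as expected, and images without a caption are skipped by this sketch rather than reported. The resulting shards would then typically be passed to training with a brace pattern in `--train-data` together with `--dataset-type webdataset`.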