Open shuhao02 opened 11 months ago
Hi, I randomly sampled some captions from the LAION-400M dataset using the webdataloader with shuffling enabled. Curating them carefully so that they consist of hard negatives might be useful, but I didn't analyze this in detail.
Thanks.
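For anyone who wants to build a similar file, here is a minimal sketch of drawing ~1000 captions uniformly at random from a caption stream. It is not the script used for this repository: it swaps in reservoir sampling rather than relying on a shuffled loader, and the names `sample_captions`, `k`, and `seed` are hypothetical. It assumes you already have an iterable of caption strings, for example the text fields read from LAION-400M shards.

```python
import random
from typing import Iterable, List


def sample_captions(captions: Iterable[str], k: int = 1000, seed: int = 0) -> List[str]:
    """Uniformly sample up to k non-empty captions from a stream.

    Uses reservoir sampling (Algorithm R), so the stream is read once
    and never has to fit in memory. Hypothetical helper, not repo code.
    """
    rng = random.Random(seed)
    reservoir: List[str] = []
    seen = 0  # number of non-empty captions seen so far
    for caption in captions:
        # Collapse internal whitespace so each caption fits on one line.
        caption = " ".join(caption.split())
        if not caption:
            continue
        seen += 1
        if len(reservoir) < k:
            # Fill the reservoir with the first k captions.
            reservoir.append(caption)
        else:
            # Keep the new caption with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = caption
    return reservoir


if __name__ == "__main__":
    # Stand-in stream; replace with captions read from the dataset.
    stream = (f"a photo of object {i}" for i in range(5000))
    sampled = sample_captions(stream, k=1000, seed=0)
    print(len(sampled), "captions sampled")
```

The sampled list can then be written one caption per line to a text file. Filtering for hard negatives, as suggested above, would be an extra step on top of this and is not shown.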
Hi, thanks for your excellent work. I have a question regarding the regularization captions in the file ./data/regularization_captions.txt.
I am curious about how these captions were obtained. The only relevant description in the paper seems to be "~1000 randomly sampled captions for regularization." Could you share more details about their origin or how they were acquired?