moein-shariatnia / OpenAI-CLIP

Simple implementation of OpenAI CLIP model in PyTorch.
MIT License

dataset flickr8k and caption #2

Closed aliman80 closed 2 years ago

aliman80 commented 2 years ago

Thanks for sharing your excellent implementation. I want to use this model with the NUS-WIDE dataset. Can you provide your dataset, or any further guidance on how I can use the model with NUS-WIDE? For NUS-WIDE, my data is organized as class folders inside the Flickr folder, and I don't have a caption file, which is required for the text encoder.

Can you help me with this? It would be much appreciated.

Regards

moein-shariatnia commented 2 years ago

I am not familiar with the dataset you mentioned, but from my initial search it seems to have only class labels and no captions for each image. I'm not sure the CLIP model is a good option to train on this dataset. That said, if you follow the dataset section of this tutorial, it will be really straightforward to use that dataset instead of the one I used.

aliman80 commented 2 years ago

Thank you for your quick response.

  1. That means I can use my own dataset, but then I will have to generate captions for each image?
  2. Can I get the dataset that you used?

Sorry for taking your time again, but it would be a great help.

Thanks again

kartikra commented 2 years ago

What's the difference between the captions.txt and captions.csv files? Which one is the right input?

moein-shariatnia commented 2 years ago

There is not much difference between them; the csv file has an "id" column. The rest is the same!
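
For readers who have only captions.txt, the relationship described above can be sketched as follows. This is a minimal sketch, not the repository's own code. It assumes captions.txt is comma-separated with a header row containing "image" and "caption" columns, and it assumes the "id" is a running integer shared by all captions of the same image. The function name `add_id_column` is hypothetical.

```python
import csv
import io


def add_id_column(txt_stream, csv_stream):
    """Copy caption rows from txt_stream to csv_stream, adding an "id" column.

    Assumption: the input has a header row with "image" and "caption" columns.
    Assumption: rows for the same image share one integer id.
    """
    reader = csv.DictReader(txt_stream)
    writer = csv.DictWriter(csv_stream, fieldnames=["image", "caption", "id"])
    writer.writeheader()
    ids = {}  # image file name -> integer id, in order of first appearance
    for row in reader:
        image_id = ids.setdefault(row["image"], len(ids))
        writer.writerow(
            {"image": row["image"], "caption": row["caption"], "id": image_id}
        )


# Small in-memory demonstration with made-up rows.
sample = io.StringIO(
    "image,caption\n"
    "a.jpg,A dog runs on grass.\n"
    'a.jpg,"A brown dog, running."\n'
    "b.jpg,Two children play.\n"
)
out = io.StringIO()
add_id_column(sample, out)
converted = out.getvalue()
```

To run this on real files, pass file objects opened with `newline=""` (as the `csv` module documentation recommends) instead of the in-memory streams.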

moein-shariatnia commented 2 years ago

> 1. Thank you for your quick response. That means I can use my own dataset, but then I will have to generate captions for each image?
> 2. Can I get the dataset that you used? Sorry for taking your time again, but it would be a great help.
>
> Thanks again

  1. Probably yes.
  2. Yes, it's a public dataset. I used the one on Kaggle.
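
If a dataset provides only class folders, one possible way to obtain a caption file is to derive a templated caption from each folder name. The sketch below is an illustration under stated assumptions, not something this thread or the repository prescribes: the layout `root/<class_name>/<image>`, the template "a photo of a {label}", and the helper names `captions_from_class_folders` and `write_captions` are all hypothetical. Such captions carry no more information than the class label, so the doubt raised earlier about whether CLIP is a good fit still applies.

```python
import csv
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def captions_from_class_folders(root, template="a photo of a {label}"):
    """Return (image, caption) rows for images stored as root/<class_name>/<file>.

    Assumption: each sub-folder name is the class label of the images inside it.
    """
    root = Path(root)
    rows = []
    for class_dir in sorted(root.iterdir()):
        if not class_dir.is_dir():
            continue
        label = class_dir.name.replace("_", " ")
        for image_path in sorted(class_dir.iterdir()):
            # Skip anything that does not look like an image file.
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                relative = image_path.relative_to(root).as_posix()
                rows.append((relative, template.format(label=label)))
    return rows


def write_captions(rows, out_path):
    """Write rows to a comma-separated file with an "image,caption" header."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "caption"])
        writer.writerows(rows)
```

For example, `write_captions(captions_from_class_folders("my_dataset"), "captions.txt")` would produce one templated caption per image, where `my_dataset` is a placeholder path.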