openai / CLIP

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
MIT License

Getting overfitted CLIP model while fine-tuning with a large training dataset #323

Open antu-saha opened 1 year ago

antu-saha commented 1 year ago

Hi, I want to fine-tune the CLIP model on my own dataset, which is very large: more than 2M image-text pairs. I have tried various learning rates, including very small ones, but the loss decreases at first and then starts increasing again after 2 or 3 epochs. I am using the SGD optimizer. If I train on a smaller subset of the full dataset, training goes well. Because I have many captions for the same image (on average 110 captions per image), I built the dataset so that each batch is unique (all image-text pairs within a batch are unique). Due to memory constraints, I am using a batch size of 200. I still get the same training curve.
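The unique-batch construction described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used in this issue: the function name `make_unique_batches` and the `(image_id, caption)` pair format are hypothetical, and "unique" is interpreted here as "no image appears twice in the same batch". Identical caption text attached to different images is not filtered.

```python
import random
from collections import defaultdict


def make_unique_batches(pairs, batch_size, seed=0):
    """Group (image_id, caption) pairs into batches with no repeated image_id.

    Hypothetical helper: the pair format and the name are assumptions.
    """
    rng = random.Random(seed)

    # Collect all captions that belong to the same image.
    by_image = defaultdict(list)
    for image_id, caption in pairs:
        by_image[image_id].append(caption)
    for captions in by_image.values():
        rng.shuffle(captions)

    batches = []
    # Each pass takes at most one remaining caption per image, so any
    # slice of a pass contains every image at most once.
    while by_image:
        image_ids = list(by_image)
        rng.shuffle(image_ids)
        round_pairs = [(image_id, by_image[image_id].pop()) for image_id in image_ids]

        # Drop images whose captions are exhausted.
        for image_id in image_ids:
            if not by_image[image_id]:
                del by_image[image_id]

        # The last slice of a pass may be smaller than batch_size.
        for start in range(0, len(round_pairs), batch_size):
            batches.append(round_pairs[start:start + batch_size])
    return batches
```

With roughly 110 captions per image, this yields about 110 passes over the image set, each pass split into batches of at most `batch_size` distinct images. Repeated images inside a batch matter for a contrastive objective because the other captions of the same image would be treated as negatives.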

Could you please tell me why I am facing this issue? Is it because of my dataset size? What should I do to get better fine-tuning results?

Thank you very much.

Rohinivv96 commented 1 year ago

Hi @antu-saha, could you please guide me on how to fine-tune the CLIP model on a custom dataset?

BaochaoZhu commented 7 months ago

Hi @antu-saha, could you please guide me on how to fine-tune the CLIP model on a custom dataset?

antu-saha commented 7 months ago

Hi @Rohinivv96 and @BaochaoZhu, my apologies for the late reply. Sure, I can guide you through fine-tuning CLIP. We can start from the pretrained model; we just need to prepare our dataset. In my case, I have images with their corresponding captions. I created two lists (image paths and captions) and used them to create the batches. I can also share the code if needed. Please don't hesitate to reach out if you want more information. Thank you very much.
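The two-list setup described above might look roughly like the sketch below. The actual code was not posted in this thread, so the function name `iter_batches` and the sample file names are hypothetical. Image loading and tokenization are left out so the example runs with the standard library alone.

```python
def iter_batches(image_paths, captions, batch_size):
    """Yield (paths, captions) slices from two parallel lists.

    Hypothetical helper: image_paths[i] is assumed to belong to captions[i].
    """
    if len(image_paths) != len(captions):
        raise ValueError("image_paths and captions must have the same length")
    for start in range(0, len(image_paths), batch_size):
        end = start + batch_size
        yield image_paths[start:end], captions[start:end]


# Example with placeholder data (file names are made up for illustration).
image_paths = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]
captions = ["a red car", "a sleeping cat", "a bowl of soup"]

for batch_paths, batch_captions in iter_batches(image_paths, captions, batch_size=2):
    print(len(batch_paths), batch_captions)
```

In a real fine-tuning loop, each batch of paths would presumably be opened as images and passed through the `preprocess` transform returned by `clip.load`, and each batch of captions through `clip.tokenize`, before being fed to the model.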

BaochaoZhu commented 7 months ago

@antu-saha Thank you for your assistance. Would you mind sharing the code with me or emailing it to zhubaochao666888@gmail.com? I would like to learn from it. Thank you very much.