salesforce / BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License

ITM Loss Stuck at 0.63 #200

Open bfan1256 opened 7 months ago

bfan1256 commented 7 months ago

Hi, I am trying to replicate and pretrain BLIP for distillation purposes. I am using Flickr30K + COCO, and my ITM loss gets stuck at 0.63; on an initial look, all of the ITM predictions are 1. Is this a dataset-size issue or a batch issue? I've tried a smaller learning rate, a larger model, and more, but nothing seems to work.
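One observation (an assumption about the cause, not confirmed by the thread): 0.63 is almost exactly the entropy of the ITM label prior. In BLIP's pretraining code, each ITM batch contains B matched pairs plus 2B hard negatives, so about 1/3 of the labels are "match". A classifier that collapses and ignores its input minimizes cross-entropy by always predicting that prior, and its loss plateaus at the prior's entropy:

```python
import math

# BLIP's ITM batch: B positives + 2B hard negatives -> ~1/3 positive labels
p_pos = 1.0 / 3.0

# A collapsed classifier that always outputs the label prior has a
# cross-entropy equal to the entropy of that prior.
collapsed_loss = -(p_pos * math.log(p_pos) + (1 - p_pos) * math.log(1 - p_pos))
print(round(collapsed_loss, 4))  # ~0.6365, close to the reported 0.63
```

If your loss sits at this value, the ITM head is likely outputting a near-constant prediction, which points to a collapse/optimization problem (e.g. the hard-negative mining not producing informative negatives) rather than a dataset-size issue.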

bfan1256 commented 6 months ago

Update: it now seems to predict 0 for all of the ITM examples, no matter what I change.

ahorazahedi commented 5 months ago

So what is the solution for this problem? I am encountering the same problem.