dszpr opened this issue 6 months ago
I have the same question. @dxli94 Could you please take a look? Besides, I don't quite get what this is for: `empty_targets = (torch.ones(atts_opt.size(), dtype=torch.long).to(image.device).fill_(-100))` followed by `targets = torch.cat([empty_targets, targets], dim=1)`
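For what it's worth, my understanding is that `-100` is the default `ignore_index` of PyTorch's cross-entropy loss (and of the Hugging Face language-modeling heads), so filling the positions covered by the query embeddings with `-100` simply masks them out of the loss. A minimal sketch with made-up shapes (the tensor names mirror the snippet above, but the sizes and logits here are illustrative only):

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: 2 query-token positions followed by 3 text-token positions.
atts_opt = torch.ones(1, 2, dtype=torch.long)   # attention mask over the query embeddings
text_targets = torch.tensor([[5, 7, 9]])        # token ids we actually want to supervise

# Positions prepended for the query embeddings get -100 so the LM loss skips them.
empty_targets = torch.ones(atts_opt.size(), dtype=torch.long).fill_(-100)
targets = torch.cat([empty_targets, text_targets], dim=1)  # shape (1, 5)

# -100 is the default ignore_index of cross_entropy, so only the 3 text
# positions contribute to the loss; the 2 query positions are ignored.
vocab_size = 10
logits = torch.randn(1, 5, vocab_size)
loss = F.cross_entropy(
    logits.view(-1, vocab_size), targets.view(-1), ignore_index=-100
)
```

So the concatenation just makes `targets` line up position-for-position with the full input sequence (query embeddings + text tokens) while ensuring the query positions carry no gradient.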
I attach a revised version of fine-tuning blip2_opt for VQA tasks here: https://github.com/salesforce/LAVIS/issues/125#issuecomment-2200960668. Can you help me check if it's correct? Thanks!
Hi! I noticed that in the forward function in blip2_opt.py, only the questions in the VQA dataset are used: both the text_input and the target labels are derived from the opt_tokens. In the VQA task, aren't the target labels supposed to be the answers in the VQA dataset? But the answers in the VQA dataset are not used in blip2_opt.py, whereas the answers are used as target labels in blip2_t5.py. It's really confusing. Have you made any changes to blip2_opt.py?
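One common way to fine-tune for VQA is to concatenate question and answer tokens as the LM input and mask the question span out of the targets with `-100`, so only the answer tokens are supervised. This is just a hedged sketch of that masking pattern with hypothetical token ids, not the actual LAVIS code:

```python
import torch

# Hypothetical token ids; in practice these come from the OPT tokenizer
# applied to the question and answer strings.
question_ids = torch.tensor([[11, 12, 13]])  # e.g. "what color is it?"
answer_ids = torch.tensor([[21, 22]])        # e.g. "dark blue"

# The model sees question + answer as one sequence.
input_ids = torch.cat([question_ids, answer_ids], dim=1)

# Supervise only the answer span: question positions are set to -100,
# the ignore_index of the cross-entropy loss.
targets = input_ids.clone()
targets[:, : question_ids.size(1)] = -100
```

If blip2_opt.py instead builds targets from the question-only text, the model would be trained to reproduce the question rather than the answer, which matches what you observed relative to blip2_t5.py.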