Closed: yaohui120 closed this issue 3 months ago
Thank you for bringing this to our attention! Your understanding is correct, and I have addressed the issue.
However, this bug does not affect the results: the forward pass of MiniGPT-4 and BLIP-2 returns an output class containing both logits and labels, and we use those labels for evaluation, not edit_inner['labels'].
Apologies for any inconvenience caused! Please don't hesitate to get in touch if you need any assistance.
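For reference, evaluating against the labels carried by the model's output object (rather than the batch's edit_inner['labels']) can be sketched as below. This is a minimal stand-in: the VLMOutput class, its field names, and the accuracy helper are assumptions for illustration, not the actual MiniGPT-4/BLIP-2 API.

```python
from dataclasses import dataclass
from typing import List

IGNORE_INDEX = -100  # convention: positions masked out of loss/eval

@dataclass
class VLMOutput:
    # Hypothetical stand-in for the output class returned by the
    # forward pass; it carries both logits and the aligned labels.
    logits: List[List[List[float]]]  # (batch, seq, vocab) scores
    labels: List[List[int]]          # (batch, seq) target token ids

def token_accuracy(out: VLMOutput) -> float:
    """Score predictions against the labels carried by the output
    object itself, skipping IGNORE_INDEX positions."""
    correct = total = 0
    for seq_logits, seq_labels in zip(out.logits, out.labels):
        for scores, label in zip(seq_logits, seq_labels):
            if label == IGNORE_INDEX:
                continue
            pred = max(range(len(scores)), key=scores.__getitem__)
            correct += int(pred == label)
            total += 1
    return correct / total if total else 0.0
```

Because the labels travel with the output, a shape mistake in the collated edit_inner['labels'] never reaches this evaluation path.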
Oh, I was using the old version of the code:

```python
post_edit_outputs = edited_model(batch["edit_outer"])
post_batch_labels = batch["edit_outer"]["labels"]
# Some models return a bare logits tensor; others return an
# output object that carries .logits.
if not isinstance(post_edit_outputs, torch.Tensor):
    post_edit_logits = post_edit_outputs.logits
else:
    post_edit_logits = post_edit_outputs
```

I am testing the current version of the code. Thank you.
I tested the newest code on the VQA dataset and it doesn't have the problem I mentioned above. Thank you~
I found something weird in VQADataset. When the tokenizer converts the answer to token ids, it can produce two different id sequences, and only the second one can be decoded back to 'tomatoes' correctly. However, in collate_fn, when batch_size=1, trg is a list containing only one string, and this makes edit_inner['labels'] wrong. I want to know if I understand this correctly.
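To pin down the shape issue: Hugging Face-style tokenizers return flat input_ids for a single string but a nested list (one id list per item) for a list of strings, so passing a one-element list instead of a bare string changes the shape that downstream label-building code sees. A toy sketch, where the tokenizer and the token ids are made up purely for illustration:

```python
# Toy stand-in for a Hugging Face-style tokenizer; the ids and the
# helper names here are invented, only the shape behavior matters.
def _encode(text):
    # pretend "tomatoes" always maps to these two token ids
    return [7, 8]

def toy_tokenizer(batch):
    if isinstance(batch, str):
        # a bare string yields flat input_ids, e.g. [7, 8]
        return {"input_ids": _encode(batch)}
    # a list of strings yields one id list per string, e.g. [[7, 8]]
    return {"input_ids": [_encode(text) for text in batch]}
```

Code that indexes input_ids[0] gets a single token id in the flat case but a whole sequence in the nested case, which would explain edit_inner['labels'] coming out wrong when batch_size=1 and trg is a one-string list.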