Closed: zli999 closed this issue 10 months ago
Could you please provide more details? I just gave it a try and could not reproduce the problem.
I ran the file object_hallucination_vqa_llava.py with the COCO dataset, without setting use_cd.
When the batch size is 1, that problem does not happen, but the results are weird, as follows:
{"question_id": 1, "prompt": "Is there a snowboard in the image?", "text": "adratkilometer nederb\u00f6rdMult country \u043f\u043b\u043e\u0449\u0430ianodj Det modern Q", "model_id": "llava-v1.5-7b", "image": "COCO_val2014_000000310196.jpg", "metadata": {}}
(I only output 10 new tokens.)
But when the batch size is greater than 1, the first question is answered as above, while the second question triggers the following error:
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1) RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
It seems unrelated to the inputs/data; something seems wrong in the code.
When I set use_cd = True, the problem happens even on the first question:
next_tokens = torch.multinomial(cd_probs, num_samples=1).squeeze(1) RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
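For debugging, a minimal guard around the sampling call can turn this opaque RuntimeError into a report of what is actually wrong with the distribution. This is just a sketch, not part of the repository's code, and safe_multinomial is a hypothetical helper name:

```python
import torch

def safe_multinomial(probs: torch.Tensor, num_samples: int = 1) -> torch.Tensor:
    """Sample from probs, but raise a descriptive error when the distribution
    contains NaN, inf, or negative entries (the cause of the RuntimeError above)."""
    if not torch.isfinite(probs).all() or (probs < 0).any():
        n_nan = torch.isnan(probs).sum().item()
        n_inf = torch.isinf(probs).sum().item()
        n_neg = (probs < 0).sum().item()
        raise ValueError(
            f"invalid probability tensor: {n_nan} NaN, {n_inf} inf, {n_neg} negative entries"
        )
    return torch.multinomial(probs, num_samples=num_samples)
```

Replacing the failing torch.multinomial call with this wrapper at least shows whether the bad values are NaN (typically a broken forward pass) or negative/inf (typically a post-processing bug).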
Firstly, the object_hallucination_vqa_llava.py script we provided only supports a batch size of 1; we have not provided an interface for batch sizes greater than 1.
Secondly, the weird results appear to be due to the LLaVA weights not being loaded correctly; you may want to investigate that.
On my end, the output is as expected:
{"question_id": 1, "prompt": "Is there a snowboard in the image?", "text": "Yes", "model_id": "llava-v1.5-7b", "image": "COCO_val2014_000000310196.jpg", "metadata": {}}
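If garbage tokens do come from badly loaded weights, a quick sanity check is to scan the parameters for non-finite values right after loading. This is a generic sketch (find_bad_parameters is a hypothetical helper, and a clean scan does not guarantee the checkpoint matched the architecture):

```python
import torch

def find_bad_parameters(model: torch.nn.Module) -> list[str]:
    """Return the names of parameters containing NaN/inf values.

    A non-empty result means the checkpoint was corrupted or never loaded,
    which typically produces nonsense text like "adratkilometer ...".
    """
    return [
        name
        for name, param in model.named_parameters()
        if not torch.isfinite(param).all()
    ]
```

Running this on the loaded LLaVA model (before any generation) would separate a download/loading problem from a decoding problem.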
Thank you for the fast reply, and sorry for the unclear description. I just ran the code over the questions in coco_pope_adversarial.json.
When I set use_cd=False, it only outputs an answer to the first question, like this:
{"question_id": 1, "prompt": "Is there a snowboard in the image?", "text": "adratkilometer", "model_id": "llava-v1.5-7b", "image": "COCO_val2014_000000310196.jpg", "metadata": {}}
But the issue occurs on the second question:
next_tokens = torch.multinomial(cd_probs, num_samples=1).squeeze(1) RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
When I set use_cd=True, the issue happens directly on the first question.
I pulled your latest code and modified the paths of the question file, answer file, and image folder to try again. The issue still happens.
model-path = "liuhaotian/llava-v1.5-7b"
Brother, it must be an error in loading the model weights or in generation, because the output is strange even when use_cd=False.
You are right. The issue is caused by the model.
I tried the larger model "liuhaotian/llava-v1.5-13b", and the issue is fixed.
I downloaded "liuhaotian/llava-v1.5-7b" again, but the issue still exists.
Anyway, thanks for your efforts. Awesome work!
We get the error
probability tensor contains either inf, nan or element < 0
whether cd_sample=True or not. We find that the output probs is
tensor([[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0', dtype=torch.float16)
We use the model liuhaotian/llava-v1.5-7b. Could you please help solve this issue?
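Since the probs tensor is already all NaN, the non-finite values appear in the logits before sampling; forward hooks can localize which submodule first emits them. A debugging sketch with hypothetical names, assuming a standard torch.nn.Module model:

```python
import torch

def trace_non_finite(model: torch.nn.Module):
    """Attach forward hooks that record which submodules output NaN/inf.

    Useful for finding where a half-precision forward pass first breaks.
    Returns the (initially empty) offender list and the hook handles so
    the hooks can be removed afterwards.
    """
    offenders: list[str] = []

    def make_hook(name: str):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                offenders.append(name)
        return hook

    handles = [
        module.register_forward_hook(make_hook(name))
        for name, module in model.named_modules()
        if name  # skip the root module itself
    ]
    return offenders, handles
```

After running one generation step, the first entry in the offender list points at the layer where the NaNs originate (e.g. an attention block overflowing in fp16 versus a projection with unloaded weights).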