LLaVA-VL / LLaVA-NeXT


Is the LLaVA-NeXT-interleave 7B model available? #88

Closed jinsikbang closed 3 months ago

jinsikbang commented 3 months ago

Hello. Thank you for your great work.

When running interleave_demo.py, I hit the following error:

File "/home3/user/mllm/LLaVA_NeXT/llava/mm_utils.py", line 377, in __call__
    if output_ids[0, -keyword_id.shape[0] :] == keyword_id:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

I think the problem is caused by conv_mode, though.

conv_mode = "llava_v0". Is that correct? And where can I find the right conv_mode for each model?

The keyword tensor in KeywordsStoppingCriteria is tensor([2, 2, 2], device='cuda:0').

The generated output is tensor([151645], device='cuda:0').

How can I fix it? Python 3.10.x produces the same error.

My CUDA version is 11.7 and my PyTorch version is 2.0.1. Or do I need PyTorch > 2.0.1 and CUDA > 11.7?
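
For context, the error message points at the cause: output_ids[0, -keyword_id.shape[0] :] == keyword_id is an element-wise comparison that returns a bool tensor, and PyTorch refuses to coerce a multi-element tensor into a single bool inside an if. A minimal standalone sketch (not the repo's code, just mimicking the shapes reported above):

import torch

# Shapes mimic the report: a 3-token keyword tensor vs. the last 3 generated tokens.
keyword_id = torch.tensor([2, 2, 2])
output_ids = torch.tensor([[151644, 151645, 151645, 151645]])

# Element-wise `==` returns a bool tensor with three values, not a single bool.
cmp = output_ids[0, -keyword_id.shape[0]:] == keyword_id
print(cmp)  # tensor([False, False, False])

# Using that tensor directly as an `if` condition raises the error seen above.
try:
    if cmp:
        print("matched")
except RuntimeError as err:
    print(err)  # Boolean value of Tensor with more than one value is ambiguous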

k-nearest-neighbor commented 3 months ago

Forgive me if I'm not following perfectly, but were you trying the llava-next-interleave-7b available at https://huggingface.co/collections/lmms-lab/llava-next-interleave-66763c55c411b340b35873d1 ? Reproduction steps would help. (Note: I'm not a maintainer, just a user browsing these issues.)

jinsikbang commented 3 months ago

> Forgive me if I'm not following perfectly, but were you trying the llava-next-interleave-7b available at https://huggingface.co/collections/lmms-lab/llava-next-interleave-66763c55c411b340b35873d1 ? Reproduction steps would help. (Note: I'm not a maintainer, just a user browsing these issues.)

Yes, I already tried that. Following the reproduction steps gives the same error. I also changed conv_mode to qwen_1_5, but that failed too :(

When I upgrade transformers to 4.42, the model generates some text, but it is not understandable. (My current transformers and tokenizers versions are 4.40 / 0.19.1.)

If it runs well for you, could you share the output of pip list?

YuanxunLu commented 3 months ago

Same issue here.

File "/media/user/D/Codes/LLaVA-NeXT/llava/mm_utils.py", line 375, in call if output_ids[0, -keyword_id.shape[0] :] == keyword_id: RuntimeError: Boolean value of Tensor with more than one value is ambiguous

HaoZhang534 commented 3 months ago

You should change the model path from llava-next-interleave-7b to llava-next-interleave-qwen-7b and try again.
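
For example, pointing the demo at the matching checkpoint from the collection linked above (assuming --model_path accepts a Hugging Face hub ID; a local directory containing the llava-next-interleave-qwen-7b weights works the same way):

python playground/demo/interleave_demo.py --model_path lmms-lab/llava-next-interleave-qwen-7b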

luyao-cv commented 2 months ago

python playground/demo/interleave_demo.py --model_path ../weights/llava-next-interleave-qwen-7b

This is the command I ran, and it still raises the error. Is there a way to fix it?

> You should change the model path from llava-next-interleave-7b to llava-next-interleave-qwen-7b and try again.

luyao-cv commented 2 months ago

> Same issue here.
>
> File "/media/user/D/Codes/LLaVA-NeXT/llava/mm_utils.py", line 375, in __call__
>     if output_ids[0, -keyword_id.shape[0] :] == keyword_id:
> RuntimeError: Boolean value of Tensor with more than one value is ambiguous

Did you manage to solve it?

uebian commented 3 weeks ago

I encountered the same issue and fixed it by applying the following patch:

diff --git a/llava/mm_utils.py b/llava/mm_utils.py
index 62a3e50..43dac68 100755
--- a/llava/mm_utils.py
+++ b/llava/mm_utils.py
@@ -386,7 +386,7 @@ def __call__(self, output_ids: torch.LongTensor, scores: torch.FloatTensor, **kw
         offset = min(output_ids.shape[1] - self.start_len, 3)
         self.keyword_ids = [keyword_id.to(output_ids.device) for keyword_id in self.keyword_ids]
         for keyword_id in self.keyword_ids:
-            if output_ids[0, -keyword_id.shape[0] :] == keyword_id:
+            if torch.equal(output_ids[0, -keyword_id.shape[0] :], keyword_id):
                 return True
         outputs = self.tokenizer.batch_decode(output_ids[:, -offset:], skip_special_tokens=True)[0]
         for keyword in self.keywords:
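
For what it's worth, torch.equal returns a plain Python bool (True only when shapes and values both match), so it is safe as an if condition; reducing the element-wise comparison with .all() should behave the same way, though I have only sanity-checked the idea outside the repo:

import torch

a = torch.tensor([2, 2, 2])
b = torch.tensor([2, 2, 2])

print(torch.equal(a, b))       # True  -- single Python bool, safe inside `if`
print(bool((a == b).all()))    # True  -- equivalent: reduce the element-wise compare to one value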