File "/playground/InternLM-XComposer/examples/example_chat.py", line 38, in <module>
response, _ = model.chat(tokenizer, query=text, image=image, history=[], do_sample=False)
File "/root/miniconda3/envs/intern_clean/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/a52d70f582fa5773dd7b297f3e1a4caf149dcf59/modeling_internlm_xcomposer2.py", line 500, in chat
image = self.encode_img(image)
File "/root/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/a52d70f582fa5773dd7b297f3e1a4caf149dcf59/modeling_internlm_xcomposer2.py", line 116, in encode_img
assert isinstance(image, torch.Tensor)
AssertionError
Thank you for the great work! "Inference on Multiple GPUs" in the README calls example_chat.py, but it seems the code does not support multiple images as input. When I organize 2 images as described in the "Data preparation" section of the finetune guidance, the error above occurs.
So once finetuning following the guidance is finished (each sample in the JSON file contains one or more images), how can I evaluate performance on a new dataset in which each sample likewise contains one or more images? Thank you in advance!
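For reference, here is a minimal sketch of how I organized a 2-image sample. The paths and conversation text are placeholders, and the field names and "<ImageHere>" placement follow my reading of the finetune guidance, so they may not be exact:

```python
import json

# One multi-image sample; field names ("id", "image", "conversations")
# and the "<ImageHere>" placeholder are my reading of the finetune
# guidance, not an authoritative schema.
sample = {
    "id": "0",
    # two images for a single sample
    "image": ["examples/image1.png", "examples/image2.png"],
    "conversations": [
        # one "<ImageHere>" per image in the user turn
        {"from": "user", "value": "<ImageHere> <ImageHere> Compare the two images."},
        {"from": "assistant", "value": "..."},
    ],
}

# The finetune data file is a JSON list of such samples.
print(json.dumps([sample], indent=2))
```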