OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Apache License 2.0

Error: Sizes of tensors must match except in dimension 1. Expected size 9 but got size 8 for tensor number 1 in the list. #590

Closed. SrikanthChellappa closed this issue 2 months ago.

SrikanthChellappa commented 2 months ago

MiniCPM-Llama3-V 2.5 base model: I see this error for a few images (not all of them), and it does not appear to be related to the image size or type. Please advise on what the possible causes could be and suggest a few potential fixes.

Code used for inference


from PIL import Image

msgs = []
msgs.append(dict(type='text', value=system_prompt))
msgs.append(dict(type='image', value=img_location))
msgs.append(dict(type='text', value=text))

# Build the user turn: text entries stay as strings, image entries are
# loaded as RGB PIL images.
content = []
for x in msgs:
    if x['type'] == 'text':
        content.append(x['value'])
    elif x['type'] == 'image':
        image = Image.open(x['value']).convert('RGB')
        content.append(image)

conversation_history.append({"role": "user", "content": content})

res = model.chat(
    image=None,
    msgs=conversation_history,
    context=None,
    tokenizer=tokenizer,
    sampling=True,
    temperature=0.1,
    stream=True
)

# Stream the generated text as it arrives.
generated_text = ""
for new_text in res:
    generated_text += new_text
    print(new_text, flush=True, end='')
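
For completeness, here is a minimal sketch of how the model, tokenizer, and conversation_history used above might be set up. The checkpoint name, dtype, and device follow the MiniCPM-Llama3-V 2.5 README and are assumptions, not part of the original report:

import torch
from transformers import AutoModel, AutoTokenizer

# Assumed setup (not shown in the original post): load MiniCPM-Llama3-V 2.5
# from Hugging Face with its custom modeling code.
model = AutoModel.from_pretrained('openbmb/MiniCPM-Llama3-V-2_5',
                                  trust_remote_code=True,
                                  torch_dtype=torch.float16)
model = model.to('cuda').eval()
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-Llama3-V-2_5',
                                          trust_remote_code=True)

# Accumulated chat turns, appended to in the snippet above.
conversation_history = []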

SrikanthChellappa commented 2 months ago

It works now after increasing the "max_inp_length" parameter (its default is 2048). I passed a larger "max_inp_length" as shown below and the error went away. I guess the maximum you can go up to for this parameter is 8192.

msgs = []
msgs.append(dict(type='text', value=system_prompt))
msgs.append(dict(type='image', value=img_location))
msgs.append(dict(type='text', value=text))

content = []
for x in msgs:
    if x['type'] == 'text':
        content.append(x['value'])
    elif x['type'] == 'image':
        image = Image.open(x['value']).convert('RGB')
        content.append(image)

conversation_history.append({"role": "user", "content": content})

res = model.chat(
    image=None,
    msgs=conversation_history,
    context=None,
    tokenizer=tokenizer,
    sampling=True,
    temperature=0.1,
    max_inp_length=4096,  #### THIS IS THE CHANGE #####
    stream=True
)

generated_text = ""
for new_text in res:
    generated_text += new_text
    print(new_text, flush=True, end='')
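
As an illustrative check (not part of the original fix), one way to gauge whether a conversation is approaching the truncation limit is to roughly count its text tokens before calling chat(); image slices add further tokens on top of this, so treat the number as a lower bound:

# Rough estimate of how many text tokens the conversation holds.
text_parts = [part for turn in conversation_history
              for part in turn["content"] if isinstance(part, str)]
approx_tokens = len(tokenizer("\n".join(text_parts))["input_ids"])
print(f"approx. text tokens: {approx_tokens} (max_inp_length=4096)")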