Maybe a bug in processing_mplugowl3.py?

hpy-42 commented 3 months ago

https://huggingface.co/mPLUG/mPLUG-Owl3-7B-240728/blob/main/processing_mplugowl3.py#L232

When self.image_processor.add_global is set to True, I think image_token_ptr should be incremented one more time on each iteration of the loop:


for next_text in text_list[1:]:
    text += self.image_processor.cut_prompt_template(img_token='<|image|>', h=cut_shape[image_token_ptr][0], w=cut_shape[image_token_ptr][1])
    text += next_text
    # Move the pointer to the next image.
    image_token_ptr += 1
    # Proposed fix: advance once more when add_global is enabled.
    if self.image_processor.add_global:
        image_token_ptr += 1
message['content'] = text
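
To make the off-by-one easier to see, here is a minimal standalone sketch of the loop above. It is not the library's code: `expand_image_tokens` and the `<img HxW>` marker are hypothetical stand-ins for the processor method and for `cut_prompt_template`, and it assumes that with add_global enabled, cut_shape holds two entries per image (the cut grid first, then one extra entry for the global view).

```python
def expand_image_tokens(content, cut_shape, add_global, ptr=0):
    """Replace each '<|image|>' placeholder using the shape at cut_shape[ptr].

    Hypothetical helper for illustration only. Returns the expanded text
    and the pointer position after the last image.
    """
    text_list = content.split('<|image|>')
    text = text_list[0]
    for next_text in text_list[1:]:
        h, w = cut_shape[ptr]
        # Stand-in for self.image_processor.cut_prompt_template(...)
        text += f'<img {h}x{w}>'
        text += next_text
        # Move the pointer to the next image.
        ptr += 1
        # Assumption: add_global adds one extra cut_shape entry per image,
        # so the pointer has to skip it as well.
        if add_global:
            ptr += 1
    return text, ptr


# Two images; under the assumption above, each contributes (grid, global).
cut_shape = [(2, 3), (1, 1), (1, 2), (1, 1)]
content = 'a<|image|>b<|image|>c'

with_fix, end_ptr = expand_image_tokens(content, cut_shape, add_global=True)
without_fix, _ = expand_image_tokens(content, cut_shape, add_global=False)
print(with_fix)     # second image uses its own grid, (1, 2)
print(without_fix)  # second image wrongly picks up the first image's extra entry
```

Calling it with add_global=False on the same cut_shape reproduces the single-increment behavior: the second placeholder reads index 1 instead of index 2, which is the mismatch reported here.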

LukeForeverYoung commented 2 months ago

Thank you for pointing out this issue. It is a bug and we will fix it soon. In our demo and evaluation we turn off image cut, so for now it does not affect the model's performance in most scenarios.

LukeForeverYoung commented 2 months ago

We have fixed this issue and updated the code on Hugging Face and ModelScope. However, we found that because we never trained the model on multi-image input with image cut enabled, performance in that setting is suboptimal. We will address this weakness in the next model release.

X-PLUG / mPLUG-Owl

Maybe a bug in processing_mplugowl3.py? #231