BAAI-DCAI / Bunny

A family of lightweight multimodal models.
Apache License 2.0

bunny-llama3 image processing issue #52

Closed junming-yang closed 2 months ago

junming-yang commented 2 months ago
  File "xxx/bunnyllama3.py", line 36, in generate_inner
    image_tensor = self.model.process_images([image], self.model.config).to(dtype=self.model.dtype)
  File "xxx/.cache/huggingface/modules/transformers_modules/BAAI/Bunny-Llama-3-8B-V/f2df3cf03156eaba4c34815675d5aac9a9e0bec2/modeling_bunny_llama.py", line 2771, in process_images
    image = self.expand2square(image, tuple(int(x * 255) for x in image_processor.image_mean))
  File "xxx/.cache/huggingface/modules/transformers_modules/BAAI/Bunny-Llama-3-8B-V/f2df3cf03156eaba4c34815675d5aac9a9e0bec2/modeling_bunny_llama.py", line 2758, in expand2square
    result = Image.new(pil_img.mode, (height, height), background_color)
  File "/root/miniconda3/envs/tr440/lib/python3.9/site-packages/PIL/Image.py", line 2941, in new
    return im._new(core.fill(mode, size, color))
TypeError: color must be int or single-element tuple

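pillow == 10.2.0, transformers == 4.40.0

Hello, when bunny-llama3 processes images, some black-and-white photos trigger the error above. After I changed background_color in the expand2square function from grey (127, 127, 127) to 'white', the error no longer occurs. Is this modification acceptable?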

Isaachhh commented 2 months ago

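In our training and evaluation, we pad the image with int(siglip.image_processor.image_mean * 255), i.e. int(0.5 * 255) = 127 for each of the three channels. This three-element padding color doesn't work for grayscale images, which have a single channel.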

After changing it to 'white', the model can work, but it may result in some performance loss (not sure).

junming-yang commented 2 months ago

How should inference on such black-and-white images be handled correctly?

Isaachhh commented 2 months ago

image = Image.open('math.jpg').convert('RGB')
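To make the one-liner above concrete, here is a minimal sketch of why converting to RGB first avoids the error. The `expand2square` below is an assumption modelled on the call visible in the traceback, not a copy of the code in modeling_bunny_llama.py, and `load_rgb_and_pad` and its `image_mean` default are hypothetical names used only for illustration.

```python
from PIL import Image


def expand2square(pil_img, background_color):
    """Pad an image to a square canvas, centering the original content.

    Assumed to resemble the model's helper; not the repository's exact code.
    """
    width, height = pil_img.size
    if width == height:
        return pil_img
    side = max(width, height)
    result = Image.new(pil_img.mode, (side, side), background_color)
    # Center the original image on the square canvas.
    result.paste(pil_img, ((side - width) // 2, (side - height) // 2))
    return result


def load_rgb_and_pad(pil_img, image_mean=(0.5, 0.5, 0.5)):
    """Hypothetical helper: convert to RGB first, then pad with the mean color.

    image_mean defaults to the 0.5 value mentioned in the thread (assumption:
    one entry per channel).
    """
    rgb = pil_img.convert('RGB')  # a grayscale ('L') image becomes 3-channel
    background = tuple(int(x * 255) for x in image_mean)  # (127, 127, 127)
    return expand2square(rgb, background)


# Usage: a 4x2 grayscale image stands in for a black-and-white photo.
gray = Image.new('L', (4, 2), 200)
padded = load_rgb_and_pad(gray)
print(padded.mode, padded.size)
```

Because the image is already in RGB mode when it reaches the padding step, the three-element background color matches the number of channels, so the same grey padding used in training is kept instead of switching to 'white'.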