microsoft / SoM

Set-of-Mark Prompting for GPT-4V and LMMs
MIT License

A question about grayscale image #8

Closed HenryHZY closed 1 year ago

HenryHZY commented 1 year ago

Hi @jwyang, I tested SoM+SAM with the following script:

from PIL import Image
image = Image.open("gray.jpg")
output, mask = inference_sam_m2m_auto(model, image, text_size, label_mode, alpha, anno_mode)
image = Image.fromarray(output)
image.save("gray-som.jpg")

Running the script throws errors about the image channel dimension (e.g., when accessing image.shape[2]). It seems that the SoM+SAM tool doesn't currently support grayscale images. Is there a mistake in my usage?

Would modifying the code directly to support grayscale images affect the performance of SoM+SAM? For example:

image = image.convert('RGB')

jwyang commented 1 year ago

Hi, @HenryHZY, yes, adding the conversion line is a way of addressing the issue.
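
For readers who land on this thread, the conversion discussed above can be sketched as a small guard applied before the image is passed to the SoM+SAM inference call. This is only an illustration using Pillow: the helper name `ensure_rgb` is hypothetical (it is not part of the SoM repository), and a synthetic in-memory image stands in for "gray.jpg". For an 8-bit grayscale image (mode "L"), `convert("RGB")` copies the single channel into R, G, and B, so pixel intensities are unchanged. Whether segmentation quality on grayscale inputs matches that on color inputs is not addressed in this thread.

```python
from PIL import Image


def ensure_rgb(image: Image.Image) -> Image.Image:
    """Return a 3-channel RGB image; hypothetical helper, not part of SoM."""
    if image.mode != "RGB":
        # For mode "L", convert("RGB") replicates the single channel
        # into R, G, and B, so the array gains a third dimension.
        image = image.convert("RGB")
    return image


# Synthetic stand-in for Image.open("gray.jpg"): a 4x3 mid-gray image.
gray = Image.new("L", (4, 3), color=128)
rgb = ensure_rgb(gray)

print(gray.mode, gray.getbands())  # one band before conversion
print(rgb.mode, rgb.getbands())    # three bands after conversion
```

In the script from the original question, the same guard would go right after `Image.open(...)` and before the call to `inference_sam_m2m_auto`.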

HenryHZY commented 1 year ago

Thanks a lot.