gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0

Infer image mode when setting Image.image_mode #9931

Open meg-huggingface opened 1 week ago

meg-huggingface commented 1 week ago

Is your feature request related to a problem? Please describe.
When a .png file with a transparent background is loaded into an Image component (for example, as a watermark), the image_mode needed to preserve transparency (RGBA) is not assigned. Instead, it defaults to RGB, which replaces the transparent regions with a solid color (black or white, depending on which image I tested).

Describe the solution you'd like
'Infer' the appropriate image_mode from the file extension (e.g., assign 'RGBA' when the file ends in .png). This could involve, for example, setting the image_mode default value to None (instead of RGB, as happens currently).

Additional context
The Image component image_mode parameter is described as: "If set to None, the image_mode will be inferred from the image file type (e.g. "RGBA" for a .png image, "RGB" in most other cases)." However, the default value, when None is not explicitly provided, is RGB.
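The inference described in that docstring could look roughly like the sketch below. This is an illustration only, not gradio's implementation: `infer_image_mode` is a hypothetical helper, and the rule (RGBA for .png, RGB otherwise) is taken directly from the quoted description.

```python
from pathlib import Path
from typing import Optional


def infer_image_mode(path: str, image_mode: Optional[str] = None) -> str:
    """Hypothetical helper: pick an image mode when none is given explicitly."""
    # An explicitly requested mode always wins, so existing code that
    # passes image_mode="RGB" keeps its current behavior.
    if image_mode is not None:
        return image_mode
    # Otherwise fall back to the extension-based rule from the docstring.
    suffix = Path(path).suffix.lower()
    return "RGBA" if suffix == ".png" else "RGB"
```

Under this sketch, a .png watermark would keep its alpha channel by default, while a .jpg upload would still be handled as RGB.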

abidlabs commented 1 week ago

Thanks @meg-huggingface, see https://github.com/gradio-app/gradio/pull/9932#issuecomment-2471563709, this is intended behavior. Open to discussing whether we should change this, but the earliest we could make this change would be 6.0 to avoid breakage so I'm inclined to close.

meg-huggingface commented 1 week ago

Thanks @abidlabs. So, here's the issue with the 'RGB' default: it breaks transparency, since 'RGB' has no transparency channel.

Say I want to upload a .png file with transparency, using Image(), e.g., Image(filepath='path/to/file.png'). The transparency will become either black or white (depending on the file). Screenshots attached: one with the 'default' behavior (black background), and one with the transparency-preserving behavior (no black background).

[Screenshot 2024-11-12 at 1:58:23 PM] [Screenshot 2024-11-12 at 1:57:42 PM]
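One plausible explanation for the black-or-white difference: a fully transparent pixel still stores red, green, and blue values, and a conversion that simply discards the alpha channel (which, as far as I know, is what Pillow's `Image.convert("RGB")` does for RGBA input) exposes whatever those hidden values are. The sketch below models that with plain tuples; `drop_alpha` is a hypothetical name used only for illustration.

```python
from typing import Tuple

RGBAPixel = Tuple[int, int, int, int]
RGBPixel = Tuple[int, int, int]


def drop_alpha(pixel: RGBAPixel) -> RGBPixel:
    """Model an RGBA -> RGB conversion that discards alpha without compositing."""
    red, green, blue, _alpha = pixel
    return (red, green, blue)


# Two fully transparent pixels (alpha = 0) that look identical on screen
# but carry different hidden colour values.
transparent_black = (0, 0, 0, 0)
transparent_white = (255, 255, 255, 0)

print(drop_alpha(transparent_black))  # the hidden black becomes visible
print(drop_alpha(transparent_white))  # the hidden white becomes visible
```

Which colour appears would then depend on how the tool that exported the .png filled its transparent regions.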

And here are two identical Spaces, using images within Examples. One uses gr.Image(type='filepath'): https://huggingface.co/spaces/meg/watermark_demo_default -- this is the current default behavior. Notice the transparent background becoming black.

And one uses gr.Image(type='filepath', image_mode='RGBA'): https://huggingface.co/spaces/meg/watermark_demo -- this throws an error for a JPEG upload. See the 4th image and the corresponding logs: OSError: cannot write mode RGBA as JPEG

One solution is to always specify the image_mode if it's not RGB. But: How does a user know the image_mode beforehand? They'd have to write an extra function to get the encoded image type.

A function that aims to pass an image through gradio without conversion would therefore require either:

  1. An additional function to get the image_mode. This is a bit of a shame, since PIL already does that under the hood in gradio -- it would be duplicative, in addition to extra coding for details that are a bit in-the-weeds, given gradio's general user-friendliness.
  2. Specifying that users can only upload a specific image extension, e.g., only .png or only .jpg, which seems overly constraining for general use cases.
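For reference, the "additional function" in option 1 might look like the sketch below. It uses only the standard library and only understands PNG headers; `sniff_image_mode` and the mapping table are hypothetical names, and bit depth is ignored, so the result is an approximation of the mode Pillow would report via `Image.open(path).mode`.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# PNG colour type (one byte in the IHDR chunk) -> Pillow-style mode name.
# Bit depth is ignored here, so 1-bit and 16-bit images are not distinguished.
PNG_COLOR_TYPE_TO_MODE = {0: "L", 2: "RGB", 3: "P", 4: "LA", 6: "RGBA"}


def sniff_image_mode(header: bytes, fallback: str = "RGB") -> str:
    """Hypothetical helper: guess an image mode from the first 26 bytes of a file."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width (4 bytes), height (4), bit depth (1), colour type (1).
    if len(header) < 26 or not header.startswith(PNG_SIGNATURE):
        return fallback  # not a PNG (e.g. a JPEG), assume no alpha channel
    if header[12:16] != b"IHDR":
        return fallback  # malformed PNG: IHDR must be the first chunk
    return PNG_COLOR_TYPE_TO_MODE.get(header[25], fallback)
```

A caller would read the first 26 bytes of the uploaded file in binary mode and pass the result as image_mode. Having to do this in user code is the duplication described in option 1.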

Further details available in this Slack comment: https://huggingface.slack.com/archives/C02SPHC1KD1/p1731355598870579?thread_ts=1731355373.324029&cid=C02SPHC1KD1

meg-huggingface commented 1 week ago

All that said: I spent a bit of time looking through all of this, and I think I see how everything can be altered so that the image_mode is inferred and the tests all still pass. I'd be happy to submit a PR that fully addresses this issue, since I've already done much of the work in my fork, including changes to existing tests and additional tests. If there are also Spaces that might break in ways the tests aren't capturing, let me know so I can check those too. The main issue I see is that gradio is now built on an assumption of RGB for the image_mode, so code that requires every image passed in, regardless of extension, to come back as RGB would need to be updated to specify image_mode='RGB'.