gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0

Add event to `gr.AnnotatedImage` that allows getting coordinates #6499

Open jumutc opened 1 year ago

jumutc commented 1 year ago

Is your feature request related to a problem? Please describe.
Currently there is no click event available for AnnotatedImage, so it is impossible, for instance, to do anything with the annotations displayed on top of the image. Imagine you would like to re-label them or do anything else the Gradio framework is capable of.

Describe the solution you'd like
The AnnotatedImage class and the related frontend files should be adjusted accordingly.

Additional context
No additional files or context

abidlabs commented 1 year ago

Hi @jumutc, the gr.AnnotatedImage component supports a .select() event, which accomplishes what you are describing. Please see https://github.com/gradio-app/gradio/blob/379b6f662ebacc04298da93ef4cd61922c2fdf71/demo/image_selections/run.py for an example.
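For readers following along, here is a minimal sketch of what wiring a `.select()` listener to `gr.AnnotatedImage` could look like. The boxes, labels, placeholder image, and the helper names `describe_selection` and `build_demo` are made up for illustration; the only behaviour relied on is that the listener receives a `gr.SelectData` whose `index` identifies the clicked annotation.

```python
# Sketch only: the boxes, labels, and blank image are illustrative assumptions.
SECTIONS = [
    ((10, 10, 60, 60), "cat"),    # (x1, y1, x2, y2) bounding box and its label
    ((80, 20, 140, 90), "dog"),
]


def describe_selection(index: int) -> str:
    """Turn the annotation index delivered by .select() into a message."""
    _box, label = SECTIONS[index]
    return f"annotation {index}: {label}"


def build_demo():
    # Imported lazily so the helper above can be used without Gradio installed.
    import gradio as gr
    import numpy as np

    base = np.zeros((100, 150, 3), dtype=np.uint8)  # placeholder image

    def on_select(evt: gr.SelectData) -> str:
        # For AnnotatedImage, evt.index is the index of the selected annotation.
        return describe_selection(evt.index)

    with gr.Blocks() as demo:
        annotated = gr.AnnotatedImage(value=(base, SECTIONS))
        selected = gr.Textbox(label="Selected annotation")
        annotated.select(on_select, None, selected)
    return demo
```

Calling `build_demo().launch()` would start the app. Note that the listener only learns which annotation was selected, not where on the image the click landed.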

jumutc commented 1 year ago

@abidlabs Image supports it, but nothing close exists for AnnotatedImage.

jumutc commented 1 year ago

I would like to contribute a PR that solves this issue.

abidlabs commented 1 year ago

The .select() event for AnnotatedImage accomplishes what the .click() event for Image accomplishes, and more.

jumutc commented 1 year ago

But what if I have many examples of the same class that I need to differentiate, and I don't want to clutter the annotations below the image and the masks, @abidlabs? Select doesn't solve this.

jumutc commented 1 year ago

Another use case is when I need to select several annotations at once on the mask itself.

jumutc commented 1 year ago

All of this is based on real use cases where I found Gradio lacking in my own app.

abidlabs commented 1 year ago

Doesn't the .click() event you added in your PR provide the same info as the existing .select() event? Or am I missing something?

jumutc commented 1 year ago

No, it provides the coordinates of the click instead of the annotation index. Very different info indeed.

abidlabs commented 1 year ago

But couldn't you get the annotation index from the coordinates?
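A rough sketch of the lookup being asked about, assuming the annotations are axis-aligned bounding boxes given as `(x1, y1, x2, y2)` pixel tuples. The function name `annotations_at` and the sample boxes are hypothetical. It also shows why coordinates carry more information than a single index: one point can fall inside several overlapping annotations, or inside none.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in image pixels


def annotations_at(x: int, y: int, boxes: List[Box]) -> List[int]:
    """Return the indices of every box that contains the point (x, y)."""
    return [
        i
        for i, (x1, y1, x2, y2) in enumerate(boxes)
        if x1 <= x <= x2 and y1 <= y <= y2
    ]


boxes = [(10, 10, 60, 60), (40, 40, 120, 100), (200, 50, 260, 110)]
print(annotations_at(50, 50, boxes))    # [0, 1]: the point lies in two overlapping boxes
print(annotations_at(150, 150, boxes))  # []: the point hits no annotation
```

For mask annotations the same idea would apply with a per-pixel test on each mask array instead of the box comparison.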

jumutc commented 1 year ago

I need coordinates to group annotations, and edit them in bulk.

abidlabs commented 1 year ago

I'll reopen this issue so that we can think through it a bit more. I think we'd need to understand the use case a bit better, if you'd like to elaborate more on what you're using this for.

In the meantime, we've made it possible for Gradio users to create their own custom components -- meaning that you could take your changes and publish them as a new Gradio component that you or anyone else could use. Here are some examples of custom Gradio components:

* A "Rich Textbox" that allows you to write bold/italics/colored text: https://huggingface.co/spaces/abidlabs/gradio_rich_textbox
* A "Folium Map Viewer" component that allows you to use interactive maps: https://huggingface.co/spaces/freddyaboulton/gradio_folium

You can see the source code for those components by clicking the "Files" icon and then clicking "src". The complete source code for the backend and frontend is visible. If you'd like to get started, we've put together a Guide: https://www.gradio.app/guides/five-minute-guide, and we're happy to help.

jumutc commented 1 year ago

My usecase boils down to a few essential missing parts of gr.AnnotatedImage:

I have solved my use case by introducing a click event for masks in https://github.com/gradio-app/gradio/pull/6501, so that I can capture the coordinates and select the needed annotations based on them. There might be more efficient ways of doing so, such as capture events where one can select an area of interest within a mask/image.

aliabid94 commented 1 year ago

What you're saying makes a lot of sense @jumutc! The one thing I'd change is that, currently we usually only send event data through .select events. I think it'd make more sense to expand the SelectData in this case to add another property called coordinates that carries the image coordinates. That way it's still the .select event listener that sends data. We can expand the SelectData type to take [str]: any for any additional properties

aliabid94 commented 1 year ago

If you'd be able to update your previous PR to incorporate this into .select, that'd be awesome

jumutc commented 1 year ago

Sure, I can try to do that, @aliabid94, although a .click event seemed more natural to me in this case, and it is also handled completely separately from .select and in a different way.

jumutc commented 1 year ago

@aliabid94 the change to my PR is easy, but it makes it hard to differentiate between the two .select events, one coming from clicking on a label and one from clicking on the image. In my use case I would like to handle only one of them (clicking on the image) and keep the default JS behaviour for label clicks (highlighting on the client side only). Refreshing the gr.AnnotatedImage component every time seems like a bad idea to me.
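To make the trade-off concrete, here is a sketch of how a handler might tell the two sources apart if the select payload carried an optional `coordinates` property, as proposed above. `SelectPayload` is a hypothetical stand-in and not Gradio's `SelectData` class, and the `extras` dictionary only mimics the suggested open-ended `[str]: any` properties. This covers only the handler side; it would not by itself avoid the server round trip for label clicks.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class SelectPayload:
    """Hypothetical stand-in for an extended SelectData; not Gradio's class."""

    index: Optional[int]  # annotation index
    value: Optional[str]  # annotation label
    extras: Dict[str, Any] = field(default_factory=dict)  # open-ended extra properties

    @property
    def coordinates(self) -> Optional[Tuple[int, int]]:
        # Present only when the event came from a click on the image itself.
        xy = self.extras.get("coordinates")
        return tuple(xy) if xy is not None else None


def handle_select(evt: SelectPayload) -> str:
    """Branch on whether the event carries image coordinates."""
    if evt.coordinates is None:
        return f"label click on {evt.value!r}"
    x, y = evt.coordinates
    return f"image click at ({x}, {y}) on {evt.value!r}"
```

A label click would be built as `SelectPayload(0, "cat")` and an image click as `SelectPayload(1, "dog", {"coordinates": [12, 34]})`.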

SharkWipf commented 3 months ago

I'm new to Gradio so I might be mistaken, but I believe this kind of feature is exactly what's needed for smooth SAM-2 support.
SAM-2 has you place positive and negative points* on images/frames and generates "masklets" based off of that.
So you would want the ability to register and annotate clicks at coordinates as well as masks.
As far as I can tell, there currently doesn't seem to be a sensible way, if any, in Gradio to do such a thing.

* There also is no way to actually place points with AnnotatedImage, but drawing small squares could probably be enough there.
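As a sketch of the small-squares workaround mentioned in the footnote, the helper below turns clicked points into tiny bounding-box annotations in the `(box, label)` form that `gr.AnnotatedImage` accepts. The function name `points_to_boxes`, the `half` size parameter, and the "positive"/"negative" labels are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[int, int, bool]  # (x, y, is_positive)
Annotation = Tuple[Tuple[int, int, int, int], str]  # ((x1, y1, x2, y2), label)


def points_to_boxes(
    points: List[Point], width: int, height: int, half: int = 3
) -> List[Annotation]:
    """Represent each point as a small square, clamped to the image bounds."""
    annotations = []
    for x, y, is_positive in points:
        box = (
            max(x - half, 0),
            max(y - half, 0),
            min(x + half, width - 1),
            min(y + half, height - 1),
        )
        annotations.append((box, "positive" if is_positive else "negative"))
    return annotations


print(points_to_boxes([(50, 40, True), (1, 2, False)], width=100, height=80))
# [((47, 37, 53, 43), 'positive'), ((0, 0, 4, 5), 'negative')]
```

The result could be passed as the annotation list of an `AnnotatedImage` value, but collecting the click coordinates in the first place is still the missing piece discussed in this issue.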

As a sidenote: It seems a bit strange to me that there are 3 different image components, each with completely different subsets of features, even when said features would make just as much sense on the one as on another. Wouldn't it be more sensible to have a single unified dynamic image component, with features you can enable as needed?