gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0
32.13k stars, 2.4k forks

Multimodal Input TextBox #4668

Closed taoari closed 7 months ago

taoari commented 1 year ago

Is your feature request related to a problem? Please describe.

Multimodal LLMs have become popular. However, for multimodal input, a Gradio app currently has to use separate widgets for images, videos, audio, and files (attachments). The resulting UI is unintuitive; it would be good to have a single multimodal input textbox.

Describe the solution you'd like

Any modern chat app has a multimodal input textbox, e.g. Slack, Teams, etc. The screenshot shows the Slack input box; it would be nice to have something similar.

image

Additional context

It would also be great if gr.Chatbot could be updated accordingly so that it can show text, images, videos, and attachments in a single message. The current version of gr.Chatbot only shows a single modality per message (text or image, but not both). Issue #4667 (https://github.com/gradio-app/gradio/issues/4667) is a related bug in how files are shown. It would also be great if the Chatbot could show 3D models, as there is a gr.Model3D component.

pngwn commented 1 year ago

I don't know if we'd want a full rich text input like Slack, but something like Discord or WhatsApp might be nice: the ability to send text along with various media (images, audio, video).

Cc @dawoodkhan82

dawoodkhan82 commented 1 year ago

@pngwn @taoari I've actually thought about this, and I think it's a good idea. Especially as more multimodal projects become popular, it would be good to have a component that supports them. I think making the chatbot a single component (input + output) also makes a lot of sense and would be easier for devs to use. We can explore this in 4.0.

taoari commented 1 year ago

@pngwn @dawoodkhan82 It's great to see this is on the wish list, looking forward to it.

dawoodkhan82 commented 1 year ago

@pngwn @abidlabs Do you think this should be a new component or a variant of gr.Textbox()?

abidlabs commented 1 year ago

The rich textbox should be a separate component, particularly if we want to support files, as that would involve changing the API (we could do a similar tuple format to support files).
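
As a rough illustration of the "tuple format" idea, here is a minimal sketch in plain Python. The `RichValue` type and the `normalize` helper are hypothetical names made up for this example; they are not part of Gradio, and the thread does not specify the final shape of the value.

```python
from typing import List, Tuple, Union

# Hypothetical value type for a rich textbox (an assumption, not a Gradio API):
# either plain text, or a (text, files) tuple when files are attached.
RichValue = Union[str, Tuple[str, List[str]]]


def normalize(value: RichValue) -> Tuple[str, List[str]]:
    """Return (text, files) whether or not any files were attached."""
    if isinstance(value, str):
        # Plain text: no attachments.
        return value, []
    text, files = value
    # Copy the list so callers cannot mutate the original value.
    return text, list(files)
```

A function receiving the textbox value could call `normalize` first and then handle text and files uniformly, which is one reason a separate component with its own value format may be simpler than overloading gr.Textbox.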

abidlabs commented 1 year ago

Aside: it would be cool if the rich textbox could support text color so that we could address this feature request: https://github.com/gradio-app/gradio/issues/2303

dawoodkhan82 commented 1 year ago

@abidlabs we can allow file upload and the text styling features to be turned off for the rich textbox, in case a dev wants only one feature and not the other.

abidlabs commented 10 months ago

Hey! We've now made it possible for Gradio users to create their own custom components -- meaning that you can write some Python and JavaScript (Svelte), and publish it as a Gradio component. You can use it in your own Gradio apps, or share it so that anyone can use it in their Gradio apps. Here are some examples of custom Gradio components:

You can see the source code for those components by clicking the "Files" icon and then clicking "src". The complete source code for the backend and frontend is visible. In particular, it's very fast to get started if you want to build off an existing component. We've put together a Guide: https://www.gradio.app/guides/five-minute-guide, and we're happy to help. Hopefully this will help address this issue.

abidlabs commented 7 months ago

Closing this issue in favor of: https://github.com/gradio-app/gradio/issues/6976

abidlabs commented 5 months ago

Just FYI @taoari we now support a gr.MultimodalTextbox component in gradio
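
For readers landing here, a minimal sketch of how the component might be used with gr.Chatbot. It assumes a Gradio 4.x release in which gr.MultimodalTextbox passes a dict with `"text"` and `"files"` keys (each file being a path string) and gr.Chatbot uses tuple-style history; newer releases may differ, so check the docs for your version. The helper names `add_message` and `build_demo` are illustrative.

```python
from typing import Any, Dict, List, Optional, Tuple

# One chat turn in tuple-style Chatbot history: (user_message, bot_message).
# A file is represented as a 1-tuple holding its path (assumed format).
Turn = Tuple[Any, Optional[str]]


def add_message(history: Optional[List[Turn]], message: Dict[str, Any]) -> List[Turn]:
    """Append the files and text of one multimodal submission to the history."""
    new_history = list(history or [])
    # Each uploaded file becomes its own user message.
    for path in message.get("files") or []:
        new_history.append(((path,), None))
    # Add the text only if the user actually typed something.
    text = message.get("text")
    if text:
        new_history.append((text, None))
    return new_history


def build_demo():
    """Wire the handler to Gradio components (gradio is imported lazily)."""
    import gradio as gr

    with gr.Blocks() as demo:
        chatbot = gr.Chatbot()
        box = gr.MultimodalTextbox(placeholder="Enter a message or upload files")
        # On submit, pass the current history and the textbox value to the handler.
        box.submit(add_message, [chatbot, box], chatbot)
    return demo
```

With gradio installed, `build_demo().launch()` should start the app. The handler is kept free of Gradio imports so it can be tested on its own.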