gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0
32.49k stars 2.44k forks source link

Fix chatinterface multimodal bug #9119

Closed freddyaboulton closed 1 month ago

freddyaboulton commented 1 month ago

Description

Closes: #9107

Consolidates some logic across stream and non-stream code-paths to make the code more maintainable. Also adds more e2e tests for multimodal cases

🎯 PRs Should Target Issues

Before your create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.

Not adhering to this guideline will result in the PR being closed.

Tests

  1. PRs will only be merged if tests pass on CI. To run the tests locally, please set up your Gradio environment locally and run the tests: bash scripts/run_all_tests.sh

  2. You may need to run the linters: bash scripts/format_backend.sh and bash scripts/format_frontend.sh

gradio-pr-bot commented 1 month ago

🪼 branch checks and previews

• Name Status URL
Spaces ready! Spaces preview
Website ready! Website preview
Storybook ready! Storybook preview
:unicorn: Changes detected! Details

Install Gradio from this PR

pip install https://gradio-pypi-previews.s3.amazonaws.com/02798ec170be7c9e8756dec24ef29c7f46fe2060/gradio-4.41.0-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@02798ec170be7c9e8756dec24ef29c7f46fe2060#subdirectory=client/python"

Install Gradio JS Client from this PR

npm install https://gradio-npm-previews.s3.amazonaws.com/02798ec170be7c9e8756dec24ef29c7f46fe2060/gradio-client-1.5.0.tgz
gradio-pr-bot commented 1 month ago

🦄 change detected

This Pull Request includes changes to the following packages.

Package Version
gradio patch

With the following changelog entry.

Fix chatinterface multimodal bug

Maintainers or the PR author can modify the PR title to modify this entry.

#### Something isn't right? - Maintainers can change the version label to modify the version bump. - If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can [update the changelog file directly](https://github.com/gradio-app/gradio/edit/9107-chatinterface-multimodal/.changeset/smart-pants-dance.md).
freddyaboulton commented 1 month ago

It doesn't have to be just text, it can be any valid Chatbot response, component instances included.

I think the issue is that chatinterface only allows the bot to send one response at a time. For sending files, it's best to return a component (e.g. gr.Image), but you can't send both a component and text in the same message. It's possible with a regular Blocks app though. So maybe we can make ChatInterface return more than one message at a time?

abidlabs commented 1 month ago

Ah right, I was confused, I thought you might be able to return {"text": ..., "files": []} as well if the chatbot is multimodal. I just need to have a clearer mental picture / we might need to improve documentation here.

I think the issue is that chatinterface only allows the bot to send one response at a time. For sending files, it's best to return a component (e.g. gr.Image), but you can't send both a component and text in the same message. It's possible with a regular Blocks app though. So maybe we can make ChatInterface return more than one message at a time?

This has been requested, although I'm not sure if its common enough to warrant adding yet another acceptable return type. Anyways, for a separate PR. This PR lgtm

freddyaboulton commented 1 month ago

Agreed re documentation revamp.