**Closed** · freddyaboulton closed this pull request 2 months ago
| Name | Status | URL |
|---|---|---|
| Spaces | ready! | Spaces preview |
| Website | ready! | Website preview |
| Storybook | ready! | Storybook preview |
| 🦄 Changes | detected! | Details |
**Install Gradio from this PR**

```
pip install https://gradio-builds.s3.amazonaws.com/96d6e61c927fcf15374934cfde976c0a25000db3/gradio-4.37.2-py3-none-any.whl
```

**Install Gradio Python Client from this PR**

```
pip install "gradio-client @ git+https://github.com/gradio-app/gradio@96d6e61c927fcf15374934cfde976c0a25000db3#subdirectory=client/python"
```

**Install Gradio JS Client from this PR**

```
npm install https://gradio-builds.s3.amazonaws.com/96d6e61c927fcf15374934cfde976c0a25000db3/gradio-client-1.2.1.tgz
```
| Package | Version |
|---|---|
| `@gradio/chatbot` | minor |
| `@gradio/tootils` | minor |
| `gradio` | minor |
| `website` | minor |
**Support message format in chatbot 💬**

`gr.Chatbot` and `gr.ChatInterface` now support the Messages API, which is fully compatible with LLM API providers such as Hugging Face Text Generation Inference, OpenAI's chat completions API, and the llama.cpp server. Building Gradio applications around these LLM solutions is now even easier!

`gr.Chatbot` and `gr.ChatInterface` now have a `msg_format` parameter that accepts two values: `'tuples'` and `'messages'`. If set to `'tuples'`, the default chatbot data format is expected. If set to `'messages'`, a list of dictionaries with `content` and `role` keys is expected. See below:

```python
def chat_greeter(msg, history):
    history.append({"role": "assistant", "content": "Hello!"})
    return history
```
Additionally, Gradio now exposes a `gr.ChatMessage` dataclass you can use for IDE type hints and autocompletion.

**Tool use in Chatbot 🛠️**
The Gradio Chatbot can now natively display tool usage and intermediate thoughts common in Agent and chain-of-thought workflows!
If you are using the new "messages" format, simply add a `metadata` key with a dictionary containing a `title` key and value. This will display the assistant message in an expandable message box to show the result of a tool or intermediate step.

```python
import gradio as gr
from gradio import ChatMessage
import time


def generate_response(history):
    history.append(
        ChatMessage(role="user", content="What is the weather in San Francisco right now?")
    )
    yield history
    time.sleep(0.25)
    history.append(
        ChatMessage(
            role="assistant",
            content="In order to find the current weather in San Francisco, I will need to use my weather tool.",
        )
    )
    yield history
    time.sleep(0.25)
    history.append(
        ChatMessage(
            role="assistant",
            content="API Error when connecting to weather service.",
            metadata={"title": "💥 Error using tool 'Weather'"},
        )
    )
    yield history
    time.sleep(0.25)
    history.append(ChatMessage(role="assistant", content="I will try again"))
    yield history
    time.sleep(0.25)
    history.append(
        ChatMessage(
            role="assistant",
            content="Weather 72 degrees Fahrenheit with 20% chance of rain.",
            metadata={"title": "🛠️ Used tool 'Weather'"},
        )
    )
    yield history
    time.sleep(0.25)
    history.append(
        ChatMessage(role="assistant", content="Now that the API succeeded I can complete my task.")
    )
    yield history
    time.sleep(0.25)
    history.append(
        ChatMessage(
            role="assistant",
            content="It's a sunny day in San Francisco with a current temperature of 72 degrees Fahrenheit and a 20% chance of rain. Enjoy the weather!",
        )
    )
    yield history


with gr.Blocks() as demo:
    chatbot = gr.Chatbot(msg_format="messages")
    button = gr.Button("Get San Francisco Weather")
    button.click(generate_response, chatbot, chatbot)

if __name__ == "__main__":
    demo.launch()
```
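Stripped of the UI, a tool step in the "messages" format is just a dict whose `metadata` carries a `title`. A plain-data sketch (the `is_tool_step` helper is hypothetical, for illustration only):

```python
# A tool-use step in the "messages" format: the metadata "title" is what
# the Chatbot renders as the header of the expandable box.
tool_step = {
    "role": "assistant",
    "content": "Weather 72 degrees Fahrenheit with 20% chance of rain.",
    "metadata": {"title": "🛠️ Used tool 'Weather'"},
}

def is_tool_step(message: dict) -> bool:
    """Hypothetical helper: True when a message would render as a tool box."""
    return bool(message.get("metadata", {}).get("title"))
```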
⚠️ The changeset file for this pull request has been modified manually, so the changeset generation bot has been disabled. To go back into automatic mode, delete the changeset file.
@freddyaboulton this looks great! Just one quibble: I'd suggest not using "openai" as the name of the format. It might be that they modify their message format in a way that we don't want to track. Also, Anthropic and others use the same format as well. What about `"tuples"` | `"dicts"`?
So, thinking about how we could add support for components into this: we could modify the `content` key to accept `Component` as well?

```python
class Message(GradioModel):
    role: str
    metadata: Metadata = Field(default_factory=Metadata)
    content: str | FileData | Component
```
wdyt @dawoodkhan82 @freddyaboulton
Yes, makes sense regarding renaming. What about "messages" (the name used by TGI/transformers and, from the looks of it, actually the industry-standard name; see the transformers docs and Anthropic docs)?
"messages" sounds good, compatibility with transformers/tgi makes more sense 👍
This format looks good.
> So thinking how we could add support for components into this: we could modify the content key to accept Component as well?

Regarding this, it would have to be another dict, `ComponentMessage` or `ComponentContent`, which stores the component name, the processed value, and the constructor args:

```python
class ComponentMessage(GradioModel):
    component: str
    value: Any
    constructor_args: List[Dict[str, Any]]
```
IMO it would be much nicer DX if the `ComponentMessage` class was only used internally and the developer could just pass in a `Component` object, e.g. `gr.Gallery([..., ...])`.
Yes, that is what I had in mind. We can handle the conversion from component instance to the internal payload format in pre/postprocess.
```python
import gradio as gr

with gr.Blocks() as demo:
    gr.Chatbot([
        gr.ChatMessage(role="user", content="Hello!"),
        gr.ChatMessage(role="user", content="Hello!")
    ], msg_format="messages")

demo.launch()
```
```python
import gradio as gr

with gr.Blocks() as demo:
    gr.Chatbot([
        gr.ChatMessage(role="abc", content="Hello!"),
    ], msg_format="messages")

demo.launch()
```
```python
import gradio as gr

demo = gr.ChatInterface(lambda x, y: x, msg_format="messages")
demo.launch()
```
@freddyaboulton very nice PR! Made a first pass and left some comments above. Down to do another deeper review again once these comments are addressed
Thanks for the review @abidlabs !! I think I got all of the comments (and added some more unit tests because of them :) )
(1) The first concerns this:
> If `msg_format` is "messages", then in `ChatInterface` developers can just yield the next token. They don't have to yield the entire message up to and including that token. I think this makes demos easier to write. And lets developers simply yield from their iterator.
Although I agree that this makes demos easier to write, it introduces a different behavior for iterators in the very special case where you are iterating from a `ChatInterface` with `msg_format="messages"`. This is likely to confuse users who are used to sending the complete message with `yield` in all other cases. It could also lead to bugs. For example, if you run `python demo/chatinterface_streaming_echo/messages_testcase.py` and then use it via the client, e.g.
```python
from gradio_client import Client

client = Client("http://127.0.0.1:7864/")
result = client.predict(
    message="Hello!!",
    api_name="/chat"
)
print(result)
```
you only get "!" (the final token). But if you run the regular version of this demo (with `msg_format="tuples"`), you get the entire final string: "You typed: Hello!!". This introduces a discrepancy between what a user would observe in the Gradio UI and what you get when you make a prediction with the client.
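To avoid that discrepancy, a streaming handler can keep yielding the full response accumulated so far, so the final yield (and hence the client's single return value) is the complete message. A minimal sketch of that cumulative-yield pattern:

```python
# Cumulative-yield streaming echo: each yield is the entire response so
# far, so the last chunk equals the complete message the UI displays.
def stream_echo(message: str, history=None):
    response = "You typed: " + message
    for i in range(len(response)):
        yield response[: i + 1]

chunks = list(stream_echo("Hello!!"))
# chunks[-1] == "You typed: Hello!!"
```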
(2) Just to reiterate the earlier point about the design of tools: I think we can improve the UI quite a bit.

On the second point, I think an accordion would be the best UI; this is one area where I like Streamlit's UI. If it's an accordion, we should keep it open while it is the final message, then collapse it once there are subsequent messages.
cc @pngwn @hannahblair on this front. This isn't a blocker but I think having a nice UI for tools will facilitate some nice viral comms down the road
(3) Let's add some docs for this, perhaps in the chatbot/chatinterface guides. Excited to do some nice comms here!
Discussed with @freddyaboulton and we don't need this:

> We should make the color of the tool "box" match the color of the messages

if we are bringing the bubbles back. @pngwn, perhaps you could review the design of the `chatbot_with_tools` demo after you revert the bubbles to ensure it looks good in both light and dark mode.
Thanks everyone for the reviews! I addressed all comments and updated the chatbot docs and release notes. Will prepare a guide soon.
**Description**

Main changes:

- Adds a `msg_format` parameter to `Chatbot` and `ChatInterface` so that messages can be returned as either the current list of tuples or a dictionary that is a superset of the messages/"OAI" format, e.g. `{"role": "user", "content": ""}`.
- Adds a `ChatMessage` dataclass that can be used instead of a dictionary when `msg_format="messages"`. This is nice because the IDE will autocomplete.
- ~~If `msg_format` is "messages", then in `ChatInterface` developers can just yield the next token. They don't have to yield the entire message up to and including that token. I think this makes demos easier to write, and lets developers simply yield from their iterator.~~
- Added the ability to parametrize e2e tests. If you create a Python file ending in `_testcase.py` in a demo directory corresponding to an e2e test, that demo will also be loaded into the e2e app, and you can use `go_to_testcase` to navigate to that testcase.

**Messages format overview**
The message format is a dict with two required keys, `role` and `content`. There is an additional `metadata` key that is not required, but can be used for tools and additional info about the message. Most messages returned from an "openai-compatible" client will be compatible with Gradio. The implementation of the message is below:
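A stdlib-dataclass sketch of that message shape follows; it is an approximation, since Gradio's actual model is pydantic-based and its `content` can also carry file data:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Metadata:
    # Optional extras; a "title" makes the message render as an
    # expandable tool/thought box in the Chatbot UI.
    title: Optional[str] = None

@dataclass
class Message:
    role: str                                         # required
    content: str                                      # required
    metadata: Metadata = field(default_factory=Metadata)

m = Message(role="user", content="Hello!")
```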
**Examples**

- Realistic Inference API Streaming
- ChatInterface Streaming
- ChatInterface Multimodal
- Chatbot
- Multimodal Chatbot

Closes: https://github.com/gradio-app/gradio/issues/7118