gradio-app / gradio


Gradio chatbot greeting message interfering with parameters #7925

Closed · PrashantSaikia closed this 5 months ago

PrashantSaikia commented 5 months ago

Describe the bug

Here is my app:

import gradio as gr
import ollama

model = 'kristada673/solar-10.7b-instruct-v1.0-uncensored'

def format_history(msg: str, history: list[list[str, str]], system_prompt: str):
    chat_history = [{"role": "system", "content":system_prompt}]
    for query, response in history:
        chat_history.append({"role": "user", "content": query})
        chat_history.append({"role": "assistant", "content": response})  
    chat_history.append({"role": "user", "content": msg})
    return chat_history

def generate_response(msg: str, history: list[list[str, str]], system_prompt: str, top_k: int, top_p: float, temperature: float):
    chat_history = format_history(msg, history, system_prompt)
    response = ollama.chat(model=model, stream=True, messages=chat_history, options={'top_k':top_k, 'top_p':top_p, 'temperature':temperature})
    message = ""
    for partial_resp in response:
        token = partial_resp["message"]["content"]
        message += token
        yield message

chatbot = gr.ChatInterface(
                generate_response,
                chatbot=gr.Chatbot(
                        value=[[None, "Hello, I am Charlie. Ask me anything you want."]],
                        avatar_images=["user.jpg", "chatbot.png"],
                        height="64vh"
                    ),
                additional_inputs=[
                    gr.Textbox("You are a helpful assistant and always try to answer user queries to the best of your ability.", label="System Prompt"),
                    gr.Slider(0.0,100.0, label="top_k", value=40, info="Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)"),
                    gr.Slider(0.0,1.0, label="top_p", value=0.9, info=" Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)"),
                    gr.Slider(0.0,2.0, label="temperature", value=0.4, info="The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)"),
                ],
                title="Charlie",
                theme="finlaymacklon/smooth_slate",
                submit_btn="⬅ Send",
                retry_btn="🔄 Regenerate Response",
                undo_btn="↩ Delete Previous",
                clear_btn="🗑️ Clear Chat"
)

chatbot.queue().launch()

If I remove this greeting line:

value=[[None, "Hello, I am Charlie. Ask me anything you want."]],

It works fine.

But if I have it in, I get the following error:

ollama._types.RequestError: messages must contain content

Basically, it interferes with the inputs being sent to the LLM. But I just want it to be a welcome message.

How do I have the greeting message without interfering with the rest of the app?

I have also tried putting it this way (slightly different syntax according to two different sources):

value=[(None, "Hello, I am Charlie. Ask me anything you want.")]

But the result is the same.
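
For reference, with the greeting row in place, the history that gr.ChatInterface passes to generate_response already seems to contain the (None, "Hello, I am Charlie...") pair, so format_history presumably ends up building something like this (illustrative, not actual output):

[
    {"role": "system", "content": "You are a helpful assistant ..."},
    {"role": "user", "content": None},        # from the None slot of the greeting row
    {"role": "assistant", "content": "Hello, I am Charlie. Ask me anything you want."},
    {"role": "user", "content": "<the actual user message>"},
]

The user message with content None is presumably what ollama rejects with "messages must contain content".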

Have you searched existing issues? 🔎

Reproduction

(Same code as in the description above.)

Screenshot

No response

Logs

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/queueing.py", line 522, in process_events
    response = await route_utils.call_process_api(
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/route_utils.py", line 260, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/blocks.py", line 1741, in process_api
    result = await self.call_function(
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/blocks.py", line 1308, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/utils.py", line 575, in async_iteration
    return await iterator.__anext__()
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/utils.py", line 701, in asyncgen_wrapper
    response = await iterator.__anext__()
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/chat_interface.py", line 545, in _stream_fn
    first_response = await async_iteration(generator)
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/utils.py", line 575, in async_iteration
    return await iterator.__anext__()
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/utils.py", line 568, in __anext__
    return await anyio.to_thread.run_sync(
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/gradio/utils.py", line 551, in run_sync_iterator_async
    return next(iterator)
  File "/Users/Admin/Documents/Uncensored LLM/Gradio App/gradio_app_2.py", line 18, in generate_response
    response = ollama.chat(model=model, stream=True, messages=chat_history, options={'top_k':top_k, 'top_p':top_p, 'temperature':temperature})
  File "/Users/Admin/miniforge3/lib/python3.10/site-packages/ollama/_client.py", line 173, in chat
    raise RequestError('messages must contain content')
ollama._types.RequestError: messages must contain content

System Info

Gradio Environment Information:
------------------------------
Operating System: Darwin
gradio version: 4.25.0
gradio_client version: 0.15.0

------------------------------------------------
gradio dependencies in your environment:

aiofiles: 23.2.1
altair: 4.2.0
fastapi: 0.108.0
ffmpy: 0.3.0
gradio-client==0.15.0 is not installed.
httpx: 0.27.0
huggingface-hub: 0.21.4
importlib-resources: 6.0.1
jinja2: 3.1.2
markupsafe: 2.1.1
matplotlib: 3.8.3
numpy: 1.26.4
orjson: 3.9.15
packaging: 23.2
pandas: 2.2.1
pillow: 9.2.0
pydantic: 2.6.3
pydub: 0.25.1
python-multipart: 0.0.9
pyyaml: 6.0.1
ruff: 0.3.5
semantic-version: 2.10.0
tomlkit==0.12.0 is not installed.
typer: 0.9.0
typing-extensions: 4.8.0
uvicorn: 0.25.0
authlib; extra == 'oauth' is not installed.
itsdangerous; extra == 'oauth' is not installed.

gradio_client dependencies in your environment:

fsspec: 2023.10.0
httpx: 0.27.0
huggingface-hub: 0.21.4
packaging: 23.2
typing-extensions: 4.8.0
websockets: 10.4

Severity

I can work around it

abidlabs commented 5 months ago

Hi @PrashantSaikia, could you mock out the ollama calls so that we can have a standalone minimal repro? https://stackoverflow.com/help/minimal-reproducible-example

PrashantSaikia commented 5 months ago

I can't reproduce the issue if I remove ollama and replace its responses with random responses from a list:

import gradio as gr
import random

def format_history(msg: str, history: list[list[str, str]], system_prompt: str):
    chat_history = [{"role": "system", "content":system_prompt}]
    for query, response in history:
        chat_history.append({"role": "user", "content": query})
        chat_history.append({"role": "assistant", "content": response})  
    chat_history.append({"role": "user", "content": msg})
    return chat_history

def generate_response(msg: str, history: list[list[str, str]], system_prompt: str, top_k: int, top_p: float, temperature: float):
    chat_history = format_history(msg, history, system_prompt)
    response = random.choice(["Hi", "Hey", "Hello", "Nice to meet you", "Wassup?"])
    return response

chatbot = gr.ChatInterface(
                generate_response,
                chatbot=gr.Chatbot(
                        value=[(None, "Hello, I am Charlie. Ask me anything you want.")],
                        avatar_images=["user.jpg", "chatbot.png"],
                        height="64vh"
                    ),
                additional_inputs=[
                    gr.Textbox("You are a helpful assistant and always try to answer user queries to the best of your ability.", label="System Prompt"),
                    gr.Slider(0.0,100.0, label="top_k", value=40, info="Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)"),
                    gr.Slider(0.0,1.0, label="top_p", value=0.9, info=" Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)"),
                    gr.Slider(0.0,2.0, label="temperature", value=0.4, info="The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)"),
                ],
                title="Charlie",
                theme="finlaymacklon/smooth_slate",
                submit_btn="⬅ Send",
                retry_btn="🔄 Regenerate Response",
                undo_btn="↩ Delete Previous",
                clear_btn="🗑️ Clear Chat",
                css="footer {visibility: hidden}"
)

chatbot.queue().launch(server_name="0.0.0.0", server_port=8080)

This works perfectly without any error. With Ollama calls, the issue seems to go away if I replace None with any text, like so:

value=[("Hi", "Hello, I am Charlie. Ask me anything you want.")],

But of course, that also makes a user message appear before the bot greeting. Is there any way around this issue?
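
In case it helps, one workaround I'm experimenting with is to skip the None slots in format_history so that the greeting row never reaches ollama.chat (untested sketch of my own function):

def format_history(msg: str, history: list[list[str, str]], system_prompt: str):
    chat_history = [{"role": "system", "content": system_prompt}]
    for query, response in history:
        # Skip the None slots that come from the greeting row so ollama
        # never receives a message with missing content.
        if query is not None:
            chat_history.append({"role": "user", "content": query})
        if response is not None:
            chat_history.append({"role": "assistant", "content": response})
    chat_history.append({"role": "user", "content": msg})
    return chat_history

That should keep the greeting purely cosmetic on the Gradio side, though I haven't verified it yet.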

abidlabs commented 5 months ago

We can try to take a look -- but in the meantime, you could also consider using a placeholder for the gr.Chatbot, which can be arbitrary HTML/Markdown that is shown before a user starts interacting with your chatbot.
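
For example, something along these lines (a minimal sketch, assuming you are on a Gradio version where gr.Chatbot exposes the placeholder parameter):

chatbot = gr.ChatInterface(
    generate_response,
    chatbot=gr.Chatbot(
        # placeholder accepts Markdown/HTML and is shown until the first message
        placeholder="**Hello, I am Charlie.** Ask me anything you want.",
        avatar_images=["user.jpg", "chatbot.png"],
        height="64vh",
    ),
    # ... other arguments unchanged ...
)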

PrashantSaikia commented 5 months ago

If I replace value=[(None, greeting)] with placeholder=greeting, it displays the greeting, but not as a chatbot message. Also, it does not persist: it goes away the moment the user enters a message.

abidlabs commented 5 months ago

Yes, that's the intention of placeholder -- I'm just offering it as a workaround until we have time to investigate this.

PrashantSaikia commented 5 months ago

Alright, thanks! Also, this is unrelated, but could you quickly point out how I can write a custom footer? Instead of "Built with Gradio" and the API link, I want it to display some text, like "Charlie does not collect any user data or track user queries". I can't find anything in the documentation about how or where to edit the footer.

abidlabs commented 5 months ago

There isn't a specific API for this, but I would suggest wrapping the ChatInterface within a gr.Blocks and then using gr.Markdown() to write whatever text you'd like, e.g.

with gr.Blocks() as demo:
    gr.ChatInterface(...)
    gr.Markdown(...)

demo.launch()
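
For instance, a rough sketch reusing the footer text you mentioned (the ChatInterface arguments are elided for brevity, and I haven't run this):

with gr.Blocks() as demo:
    gr.ChatInterface(
        generate_response,
        # ... same chatbot, additional_inputs, title, etc. as before ...
    )
    gr.Markdown("Charlie does not collect any user data or track user queries.")

demo.queue().launch()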

PrashantSaikia commented 5 months ago

But can't I just edit the HTML file in the backend to hardcode the footer? I can't find where it's located; perhaps it's a hidden file?

abidlabs commented 5 months ago

Hi @PrashantSaikia, it's internationalized, so you'd have to edit this file: gradio/js/app/src/lang/en.json, specifically:

Line 37: "built_with": "built with",
Line 38: "built_with_gradio": "Built with Gradio",

I'm going to close this issue since we don't have a standalone repro for the original problem. We can reopen if we get one.