posit-dev / py-shiny

Shiny for Python
https://shiny.posit.co/py/

Is there support for LangChain agents? #1610

Open kovla opened 1 month ago

kovla commented 1 month ago

ADDENDUM 20/08/24

Having posted the issue, I've been working on the agent output in the ui.Chat() component in my own application. The "support for agents" I originally asked about seems, in practice, to mean support for different kinds of messages. If one looks at the documentation of LangGraph (which can be seen as the next evolution of LangChain's agentic implementations), it can produce multiple kinds of streaming output, such as the entire agent state, state updates, LLM tokens from within chain steps, etc.

https://langchain-ai.github.io/langgraph/how-tos/stream-values/ https://langchain-ai.github.io/langgraph/how-tos/stream-updates/ https://langchain-ai.github.io/langgraph/how-tos/streaming-tokens/
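
For reference, the three streaming modes linked above look roughly like this (a minimal sketch; it assumes a compiled LangGraph graph, and the input shape is purely illustrative):

# Sketch of LangGraph's streaming modes; `graph` is assumed to be a compiled graph
async def demo(graph):
    inputs = {"messages": [("user", "What is the weather in SF?")]}

    # 1. Stream the full graph state after each step
    async for state in graph.astream(inputs, stream_mode="values"):
        print(state)

    # 2. Stream only the per-node state updates
    async for update in graph.astream(inputs, stream_mode="updates"):
        print(update)

    # 3. Stream LLM tokens emitted from inside graph nodes
    async for event in graph.astream_events(inputs, version="v1"):
        if event["event"] == "on_chat_model_stream":
            print(event["data"]["chunk"].content, end="")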

I imagine it is cumbersome to support rendering all of these streams in the GUI out of the box, but what could be improved, I believe, is support for different kinds of formatting depending on the type of output.

The key idea would be to visually indicate to the user which type each output corresponds to, while having the option to include some metadata with the message (e.g. for tool calls: state in plain language that the tool is being called, and then include a JSON with the tool name and call parameters as a collapsible element).
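
As a rough illustration (the HTML structure here is my own, not an existing Shiny convention), such a tool-call message could look something like this:

import json
from shiny import ui

# Illustrative only: a plain-language line plus a collapsible JSON block
def format_tool_call(tool_name: str, tool_args: dict) -> ui.HTML:
    payload = json.dumps({"tool": tool_name, "args": tool_args}, indent=2)
    return ui.HTML(
        f"<p>Calling tool <code>{tool_name}</code>...</p>"
        "<details><summary>Call details</summary>"
        f"<pre>{payload}</pre>"
        "</details>"
    )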

There is an example with a competing library that attempts this, though also without native support from the library: https://huggingface.co/spaces/fehmikaya/rag_agent_langgraph/blob/main/app.py

Another competing library takes a step in this direction as well: https://www.gradio.app/guides/agents-and-tool-usage#the-metadata-key

Given support for the various message formats outlined above, it should be up to the user to transform the specific kind of message they want into the right format.

NB: I have yet to explore how transform functions work in Shiny, but at first glance there doesn't seem to be any conditionality to enable multiple types. Perhaps the transformation type could become an additional parameter for append_message*()?
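
Hypothetically, the call site could look something like this (not an existing API, just the shape of the idea):

# Hypothetical parameter, not part of Shiny's current API
await chat.append_message_stream(response, transform="tool_call")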

ORIGINAL POST/ISSUE

Hi,

It was good to learn that Shiny has out-of-the-box support for LLM chats. I wonder whether LLM agents are also supported in that component. All the examples I've seen are straight LLM interactions with no tools (https://github.com/posit-dev/py-shiny/tree/main/examples/chat).

I tried adjusting the examples to work with an agent, but that did not work. There are no errors: the user can enter a prompt, the AI icon appears with an empty response, and then nothing further happens. Here is the code:

from shiny import App, ui
import dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.prompts import ChatPromptTemplate
from langchain_core.prompts.chat import MessagesPlaceholder
from langchain.tools import tool

dotenv.load_dotenv()
# Initialize the language model
llm = ChatOpenAI(model="gpt-4o", streaming=True)

# Define a basic tool (this one doesn't do anything, just a placeholder)
@tool
def basic_tool(input: str) -> str:
    """
    Basic tool that does nothing
    """
    return f"Processed input: {input}"

# Define a simple prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

tools = [basic_tool]

# Note: not actually used below; create_tool_calling_agent binds the tools itself
model_with_tools = llm.bind_tools(tools)

# Create the agent with no memory and a simple tool
agent = create_tool_calling_agent(llm=llm, tools=tools, prompt=prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

app_ui = ui.page_fillable(
    ui.panel_title("Hello Shiny Chat"),
    ui.chat_ui("chat"),
    fillable_mobile=True,
)

# Create a welcome message
welcome = ui.markdown(
    """
    Hi! How can I help you today?
    """
)

def server(input, output, session):
    chat = ui.Chat(id="chat", messages=[welcome])

    # Define a callback to run when the user submits a message
    @chat.on_user_submit
    async def _():
        # Get messages currently in the chat
        messages = chat.messages(format="langchain")
        # Create a response message stream
        response = agent_executor.stream({'input': messages})
        # Append the response stream into the chat
        await chat.append_message_stream(response)

app = App(app_ui, server)
ascket commented 3 weeks ago

I haven't worked with LangChain agents, but simple interaction with Mistral works without any problems for me (Shiny 1.0.0, Windows 11 Pro):


import os
from dotenv import load_dotenv
from langchain_mistralai.chat_models import ChatMistralAI
from shiny import App, Inputs, Outputs, Session, ui
load_dotenv()

mistral = ChatMistralAI(api_key=os.getenv("MISTRAL_API"), model="mistral-large-latest")

app_ui = ui.page_fillable(
    ui.panel_title("Hello Shiny chat"),
    ui.layout_columns(
        ui.card(
            ui.card_header("Mistral"),
            ui.chat_ui("mistral"),
        ),
    ),
    fillable_mobile=True,
)

start_messages = dict(content="Hello! How can I help you today?", role="assistant")

def server(input: Inputs, output: Outputs, session: Session):
    chatmistral = ui.Chat(id="mistral", messages=[
        start_messages,
    ])

    @chatmistral.on_user_submit
    async def chat_mistral():
        messages = chatmistral.messages(format="langchain")
        response = mistral.astream(messages)
        await chatmistral.append_message_stream(response)

app = App(app_ui, server)
kovla commented 3 weeks ago

@ascket Your code doesn't seem to be using any tools, just a model.

NB: I'll add some details to the original post to shape the question more clearly.

cpsievert commented 3 weeks ago

Hi @kovla, thanks for trying out Shiny! There are a few things going on here -- I'll try my best to break it down.

First, Chat currently doesn't have built-in support for tools, but it doesn't prevent you from using them. The reason your example currently doesn't work is that .append_message_stream() doesn't know how to correctly handle the response object, but if you transform response into a "known" format like a list of strings, it'll work:

await chat.append_message_stream(
    [x["messages"][0].content for x in response]
)

That said, looping over response in this way isn't ideal, since the entire response must be generated before the stream can start. This really matters when streaming tokens, but I'm not sure it matters for your example or for the other forms of streaming that you linked to.
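
One way around that (assuming .append_message_stream() accepts a plain generator, which is a guess on my part) is to wrap the agent stream in a generator that pulls the text out of each chunk as it arrives:

# Sketch: yield the text from each agent chunk lazily instead of building a list
def stream_agent_text(response):
    for chunk in response:
        if "messages" in chunk and chunk["messages"]:
            yield chunk["messages"][0].content

await chat.append_message_stream(stream_agent_text(response))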

If you want to stream tokens (or simply don't want to transform the response every time), it's better to let .append_message_stream() know how to find the relevant content for each message "chunk". I was able to write this to get your example working, but I haven't tried it in an actual token-streaming context.

Before actually trying/using this yourself, let me add a few disclaimers:

  1. This won't work with the current version of Shiny due to a bug in .register(), which will get fixed in #1619
  2. This code is using an internal API (note the ._chat_normalize import), but this issue might be reason enough to make it more official/public.

from shiny.ui._chat_normalize import BaseMessageNormalizer, message_normalizer_registry

class LangchainAgentResponseNormalizer(BaseMessageNormalizer):
    # For each chunk of a .append_message_stream()
    def normalize_chunk(self, chunk):
        return chunk["messages"][0].content

    def can_normalize_chunk(self, chunk):
        return "messages" in chunk and len(chunk["messages"]) > 0

    # For .append_message()
    def normalize(self, message):
        return message["messages"][0].content

    def can_normalize(self, message):
        return "messages" in message and len(message["messages"]) > 0

message_normalizer_registry.register(
    "langchain-agents", LangchainAgentResponseNormalizer()
)
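
With that normalizer registered (and the .register() fix in place), the original handler should be able to pass the agent stream through unchanged:

@chat.on_user_submit
async def _():
    messages = chat.messages(format="langchain")
    # The registered normalizer now knows how to pull the text out of each chunk
    response = agent_executor.stream({"input": messages})
    await chat.append_message_stream(response)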

Lastly, if you're motivated to improve Chat's LangChain support, I'd be happy to consider PRs! If you have a look here, you'll notice Chat is aware of LangChain's "base" message formats, but it'd be great if we could extend that logic to be aware of other common LangChain formats.

jcheng5 commented 3 weeks ago

I haven't used LangChain's agents, but I have written my own tool-calling support on top of LangChain models with tools. I think there's more to the problem than normalizing messages; all of the information you throw out in doing so needs to be recovered the next time you use chat.messages(). In fact, a fundamental part of supporting streaming with tool calls is taking the streaming chunks and reassembling a fully structured message out of them for insertion into the message history.

With the current state of ui.Chat, I was able to make this work by keeping my own list for the message history (i.e. never calling chat.messages()) and using ui.Chat only for UI purposes. @cpsievert we should talk about this tomorrow.
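
In case it's useful, here's a rough sketch of that pattern (my own reconstruction rather than the actual code, with tool-call handling left out for brevity; the exact ui.Chat methods may differ by version). It assumes the chat and llm objects from the examples above:

from langchain_core.messages import HumanMessage

history = []  # the authoritative message history, kept outside ui.Chat

@chat.on_user_submit
async def _():
    history.append(HumanMessage(content=chat.user_input()))

    async def relay():
        full = None
        async for chunk in llm.astream(history):
            # Reassemble a fully structured message while yielding text to the UI
            full = chunk if full is None else full + chunk
            yield chunk.content or ""
        history.append(full)

    await chat.append_message_stream(relay())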

In the meantime, @kovla, maybe you can make sense of the code here:
https://github.com/jcheng5/py-sidebot/blob/20d5ce2fb13c5f8b7d4e919cba86d74ad19abcc3/query.py#L97-L117

kovla commented 3 weeks ago

@cpsievert @jcheng5 Thanks for your feedback!

I think that custom normalizers are a good solution on the input side (to the UI), where all kinds of complex output structures can be parsed (at least this was my understanding upon looking at the code). If so, this will be really helpful in handling various kinds of output not only from LangChain, but also from its next iteration, LangGraph (which implements a graph-based flow logic on top of LangChain). LangGraph streaming outputs can be complex in structure and also varied in type: standard chat messages (e.g. HumanMessage), tool messages, but also various kinds of events happening in the graph.

However, normalization of incoming streams does not solve the problem of HTML-formatting these messages according to their type, i.e. the output side (rendering in the UI). If I am not mistaken, transform_assistant_response() is the current solution for formatting output. In its current implementation, the applied formatting seems to be uniform for all kinds of incoming messages.

It would be useful to allow different kinds of formatting depending on the role of the message. In my own RAG application, the whole graph executes in 1 to 3 minutes, with complex logic for checking hallucinations and grounding (implementing this: https://langchain-ai.github.io/langgraph/tutorials/rag/langgraph_self_rag/ and hopefully this: https://langchain-ai.github.io/langgraph/how-tos/streaming-tokens/). Providing users with intermediate output is essential in this situation. Specifically, one could visually separate technical AI decisions (e.g. about graph routing) from streaming LLM responses in intermediate nodes, as well as from the generation in the final node (answering the question).

If transform_assistant_response() were to accept additional arguments (**kwargs?), parametrizing the HTML layout of the output could be done inside the transformer function. An alternative would be to allow passing different transformer functions to append_message*().
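
Purely as an illustration of the first option, something along these lines (a hypothetical signature; transform_assistant_response() does not currently receive these arguments):

# Hypothetical: the transformer receives metadata about each message
@chat.transform_assistant_response
def _(content: str, *, role: str = "assistant", **kwargs):
    if role == "tool":
        # e.g. collapse technical tool output by default
        return ui.HTML(f"<details><summary>Tool call</summary><pre>{content}</pre></details>")
    return ui.markdown(content)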

The underlying idea would be, of course, to provide the developer with tools on both the input and the output side to accommodate whichever AI backend they use, without Shiny having to implement out-of-the-box support for every new library.


@jcheng5 I have not had a chance to work with Anthropic models, but I have the impression that the piece of code you refer to has the same purpose as BaseMessageNormalizer, no? A quick look at the Anthropic API docs reveals that the mentioned format is used, e.g., by their Vision models:

https://docs.anthropic.com/en/api/messages-examples#vision

There seem to be other examples of output that deviates from the usual role/content standard: https://docs.anthropic.com/en/api/streaming#example.

jcheng5 commented 3 weeks ago

The link I sent you was misleading without further explanation (sorry, I think I was on my way out the door at the time). The Anthropic normalization stuff is actually not the interesting part; it was the structure around it, where streamed content was handled one way (yielded to the UI) and tool calls were handled a different way.

It doesn't matter now, though; @cpsievert and I talked about this in person, and between the two of us we now have a much clearer picture of what you want and how to do it, especially with LangChain agents. We want to tackle this for the next release.