vercel / ai

Build AI-powered applications with React, Svelte, Vue, and Solid
https://sdk.vercel.ai/docs

Examples for function composition #1566

Open HosseinAgha opened 1 month ago

HosseinAgha commented 1 month ago

Feature Description

Here you talk extensively about function composition and how the model uses a function call's result to immediately call the next function. I cannot find any example that implements this pattern using the new streamUI API.

As far as I understand, a single user message should result in multiple LLM calls, and there is no way for the LLM to do this automatically. I currently call the submitUserMessage action recursively inside the execute function of the first tool.
Is this the right approach? I'd be happy to contribute an example.

Do you plan to make this automatic? For example, by adding an allowComposition parameter to some tools?

Use Case

No response

Additional context

No response

HosseinAgha commented 1 month ago

OK, I successfully implemented function composition using streamUI by recursively calling the submitUserMessage server action from inside a tool's execute function.

This is actually the same pattern as nested streamable UIs: submitUserMessage returns a streamable that, when returned from a tool's execute function, gets wrapped inside the parent streamable.

I had to patch the streamUI logic to call streamableUI.done() with the latest value, as it looks like nested streamable UIs won't work if the parent streamable does not call done with the child streamable.
It looks like this issue is fixed by https://github.com/vercel/ai/pull/1818
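For anyone else hitting this: the "parent must finish with the child" behavior can be modeled with plain promises. This is just an illustrative model I put together, not the SDK's actual implementation:

```typescript
// A streamable's final value may itself be another streamable (nesting).
type Streamable<T> = { value: Promise<T | Streamable<T>> };

function createStreamable<T>() {
  let resolve!: (v: T | Streamable<T>) => void;
  const value = new Promise<T | Streamable<T>>((r) => (resolve = r));
  return { streamable: { value } as Streamable<T>, done: resolve };
}

// The consumer keeps unwrapping until it reaches a plain value.
async function resolveDeep<T>(s: Streamable<T>): Promise<T> {
  const v = await s.value;
  if (typeof v === "object" && v !== null && "value" in (v as object)) {
    return resolveDeep(v as Streamable<T>);
  }
  return v as T;
}

async function demo(): Promise<string> {
  const parent = createStreamable<string>();
  const child = createStreamable<string>();
  // The crucial step: the parent must finish WITH the child streamable,
  // otherwise the consumer never reaches the child's value.
  parent.done(child.streamable);
  child.done("stocks UI rendered");
  return resolveDeep(parent.streamable);
}
```

If parent.done() were called with a plain value instead of child.streamable, the child's output would simply be dropped, which matches the broken behavior I saw before patching.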

My final code is something like this:


export async function submitUserMessage(content: string, mutableAIState?: ReturnType<typeof getMutableAIState<typeof AI>>) {

  const aiState = mutableAIState ?? getMutableAIState<typeof AI>()

  const result = await streamUI({
    // model, system prompt, and messages omitted for brevity
    tools: {
      getStocks: {
        // description and parameters omitted for brevity
        execute: async function* (args, toolCallInfo) {
          yield <LoadingStocks />

          const { stocks } = await getStocks()

          aiState.update({
            ...aiState.get(),
            messages: [
              ...aiState.get().messages,
              {
                id: nanoid(),
                role: 'assistant',
                content: [{
                  ...toolCallInfo,
                  type: 'tool-call',
                  args,
                }],
              },
              {
                id: nanoid(),
                role: 'tool',
                content: [{
                  ...toolCallInfo,
                  type: 'tool-result',
                  result: stocks,
                }],
              },
            ],
          })

          // Recurse: let the model consume the tool result in a fresh turn.
          return (await submitUserMessage('', aiState)).display
        },
      },
      logStocks: {
        execute: async function* ({ stocks }) {
          // use the previous tool's result
        },
      },
    },
  })

  return {
    display: result.value,
  }
}
nckre commented 1 month ago

Based on what I've seen, the examples use two different approaches, both embedded in the components rather than in the functions:

1) The Gemini chatbot submits messages on behalf of the user, e.g. in list-flights.tsx:

<div
  key={flight.id}
  className="flex cursor-pointer flex-row items-start sm:items-center gap-4 rounded-xl p-2 hover:bg-zinc-50"
  onClick={async () => {
    const response = await submitUserMessage(
      `The user has selected flight ${flight.airlines}, departing at ${flight.departureTime} and arriving at ${flight.arrivalTime} for $${flight.price}. Now proceeding to select seats.`
    )
    setMessages((currentMessages: any[]) => [
      ...currentMessages,
      response,
    ])
  }}
>

2) The Vercel chatbot creates a system message and uses useActions to call a custom action that chains the flow (e.g. stock-purchase.tsx calls confirmPurchase)
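A framework-free model of what pattern (2) boils down to. The name confirmPurchase comes from the example above; its signature and the message shape here are my assumptions, not the actual chatbot code:

```typescript
type UIMessage = { id: number; role: "assistant" | "system"; display: string };

let nextId = 0;

// Stand-in for the dedicated server action the component obtains via useActions().
// It continues the flow directly instead of faking a user message.
async function confirmPurchase(symbol: string, amount: number): Promise<UIMessage> {
  return { id: nextId++, role: "system", display: `Purchased ${amount} shares of ${symbol}` };
}

// Roughly what the component's click handler does: call the action and
// append its returned message to the chat.
async function handleConfirm(messages: UIMessage[], symbol: string, amount: number): Promise<UIMessage[]> {
  const response = await confirmPurchase(symbol, amount);
  return [...messages, response];
}
```

The key difference from pattern (1) is that no synthetic user message enters the history; the action's result is appended directly.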

For me, the current setup is a bit confusing. There are different implementations/examples, some buggy ("cannot destructure role..."), and it's not clearly documented what the benefits of each approach are (e.g. in the Gemini example, why is the flow to get a confirmation code split into 2 separate actions, whereas everything else is part of tool calls in the default submitUserMessage?).

I understand some of this was spun up quickly for events like the Google presentation and things are moving fast, but at least the main chatbot could get some more consistent love. Maybe Ship24 will bring a new version?

In general, I agree that it would be cool to have a simpler way to chain a few AI tool calls together, e.g. in my case to (1) fetch data and display it in a component and (2) get a text response describing the data, without submitting extra user messages.
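To make the desired flow concrete, here is a minimal sketch of that two-step chain with everything mocked; getStocks and describeStocks are stand-ins for a tool call and a follow-up model call, not SDK APIs:

```typescript
type Message = { role: "user" | "tool" | "assistant"; content: string };

// Mocked data source -- stands in for a real stocks API / tool.
async function getStocks(): Promise<string[]> {
  return ["AAPL", "GOOG", "MSFT"];
}

// Stand-in for a second model call that turns the tool result into prose.
async function describeStocks(stocks: string[]): Promise<string> {
  return `Here are ${stocks.length} stocks: ${stocks.join(", ")}.`;
}

// One user turn drives two chained steps, with no extra user messages:
// step 1 fetches the data (for the component), step 2 describes it as text.
async function submitUserMessage(history: Message[], content: string): Promise<Message[]> {
  const messages: Message[] = [...history, { role: "user", content }];
  const stocks = await getStocks();
  messages.push({ role: "tool", content: JSON.stringify(stocks) });
  messages.push({ role: "assistant", content: await describeStocks(stocks) });
  return messages;
}
```

This is the shape of the chaining I'd like the SDK to support directly, instead of wiring it up through recursive actions or component-side message submission.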