sgomez / ollama-ai-provider

Vercel AI Provider for running LLMs locally using Ollama
https://www.npmjs.com/package/ollama-ai-provider

Fixing Tool Calling #11

Closed matrushka closed 3 months ago

matrushka commented 3 months ago

The existing implementation expects function-call arguments as strings (instead of objects), and the roundtrip for feeding the result back requires an updated message structure.

Also updated the @ai/provider dependencies and adjusted the tests to make sure everything is up to date.

Tested with mistral and was able to get it working well
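For reference, the shape of the change: the response schema has to accept tool-call arguments as an already-parsed object rather than a JSON string. A minimal sketch of such a schema (illustrative names, not the provider's actual internals):

import { z } from "zod";

// Ollama 0.3 returns tool-call arguments already parsed as an object,
// so the schema must accept a record instead of a JSON-encoded string.
const toolCallSchema = z.object({
  function: z.object({
    name: z.string(),
    arguments: z.record(z.unknown()),
  }),
});

const chatResponseMessageSchema = z.object({
  role: z.literal("assistant"),
  content: z.string(),
  tool_calls: z.array(toolCallSchema).optional(),
});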

louis030195 commented 3 months ago

@matrushka great! I'm also trying to get the ollama provider with function calling to work for the https://github.com/louis030195/screen-pipe desktop app

facing this with llama3.1 atm:

AI_APICallError: Invalid JSON response
    at eval (index.mjs:531:11)
    at async postToApi (index.mjs:374:14)
    at async OllamaChatLanguageModel.doGenerate (index.mjs:461:40)
    at async fn (index.mjs:1565:30)
    at async eval (index.mjs:216:22)
    at async _retryWithExponentialBackoff (index.mjs:269:12)
    at async fn (index.mjs:1555:32)
    at async eval (index.mjs:216:22)
    at async handleSendMessage (chat-list-openai-v2.tsx:142:20)
Caused by: AI_TypeValidationError: Type validation failed: Value: {"model":"llama3.1","created_at":"2024-07-26T12:36:36.798629Z","message":{"role":"assistant","content":"","tool_calls":[{"function":{"name":"query_screenpipe","arguments":{"queries":[{"end_date":"2024-07-26T12:36:32Z","limit":10,"offset":0,"q":"email","start_date":"2024-07-26T12:36:32Z"}]}}}]},"done_reason":"stop","done":true,"total_duration":4465412250,"load_duration":24267417,"prompt_eval_count":1248,"prompt_eval_duration":2426543000,"eval_count":83,"eval_duration":2009040000}.
Error message: [
  {
    "code": "invalid_type",
    "expected": "string",
    "received": "object",
    "path": [
      "message",
      "tool_calls",
      0,
      "function",
      "arguments"
    ],
    "message": "Expected string, received object"
  }
]
    at safeValidateTypes (index.mjs:237:14)
    at safeParseJSON (index.mjs:280:12)
    at eval (index.mjs:525:24)
    at async postToApi (index.mjs:374:14)
    at async OllamaChatLanguageModel.doGenerate (index.mjs:461:40)
    at async fn (index.mjs:1565:30)
    at async eval (index.mjs:216:22)
    at async _retryWithExponentialBackoff (index.mjs:269:12)
    at async fn (index.mjs:1555:32)
    at async eval (index.mjs:216:22)
    at async handleSendMessage (chat-list-openai-v2.tsx:142:20)
Caused by: ZodError
    at get error (index.mjs:699:31)
    at safeValidateTypes (index.mjs:239:33)
    at safeParseJSON (index.mjs:280:12)
    at eval (index.mjs:525:24)
    at async postToApi (index.mjs:374:14)
    at async OllamaChatLanguageModel.doGenerate (index.mjs:461:40)
    at async fn (index.mjs:1565:30)
    at async eval (index.mjs:216:22)
    at async _retryWithExponentialBackoff (index.mjs:269:12)
    at async fn (index.mjs:1555:32)
    at async eval (index.mjs:216:22)
    at async handleSendMessage (chat-list-openai-v2.tsx:142:20)

will probably test your PR

happy to help anyhow or jump on a call, can DM me on discord/x at louis030195

sgomez commented 3 months ago

Thanks @matrushka ,

I am traveling today and will not have time to check the PR until tomorrow. Is this using the new Ollama 0.3 API?

matrushka commented 3 months ago

Tested it with ollama/ollama:0.3.0 and, weirdly, also with 0.2.7 :)

The reference data structure I used is the same in both versions (last updated 2 weeks ago): https://github.com/ollama/ollama/blob/v0.3.0/server/testdata/tools/messages.json
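For reference, the roundtrip in that test data is roughly shaped like this (a trimmed sketch; field names follow the Ollama 0.3 chat API, the values are only illustrative): the assistant message carries tool_calls with arguments as an object, and the tool result is fed back as a plain string in a role "tool" message.

{
  "messages": [
    { "role": "user", "content": "What is the weather in Paris?" },
    {
      "role": "assistant",
      "content": "",
      "tool_calls": [
        {
          "function": {
            "name": "get_current_weather",
            "arguments": { "location": "Paris", "format": "celsius" }
          }
        }
      ]
    },
    { "role": "tool", "content": "22" }
  ]
}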

louis030195 commented 3 months ago

@matrushka any idea how I can install this branch as a dependency in my package.json? Tried the approaches here but they didn't work: https://stackoverflow.com/questions/16350673/depend-on-a-branch-or-tag-using-a-git-url-in-a-package-json

probably because it's a subdirectory

(I can't wait until tomorrow for the merge)

matrushka commented 3 months ago


Pull the repo to your local machine, build it with pnpm i && pnpm run build, and refer to it by its path in your project (in my case I used ../ollama-ai-provider/packages/ollama, as the built package is inside the packages directory).
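For example, assuming the clone sits next to your project, the dependency entry in package.json would look something like this (a local file: link; adjust the path to wherever you cloned and built it), followed by another pnpm i / npm i in your project so the link is picked up:

{
  "dependencies": {
    "ollama-ai-provider": "file:../ollama-ai-provider/packages/ollama"
  }
}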

louis030195 commented 3 months ago

@matrushka thx

still fails:


const screenpipeQuery = z.object({
  q: z
    .string()
    .describe(
      "The search query matching exact keywords. Use a single keyword that best matches the user intent"
    ),
  contentType: z
    .enum(["ocr", "audio", "all"])
    .describe(
      "The type of content to search for: screenshot data or audio transcriptions"
    ),
  limit: z
    .number()
    .default(5)
    .describe(
      "Number of results to return (default: 5). Don't return more than 50 results as it will be fed to an LLM"
    ),
  offset: z.number().default(0).describe("Offset for pagination (default: 0)"),
  startTime: z
    .string()
    // 1 hour ago
    .default(new Date(Date.now() - 3600000).toISOString())
    .describe("Start time for search range in ISO 8601 format"),
  endTime: z
    .string()
    .default(new Date().toISOString())
    .describe("End time for search range in ISO 8601 format"),
});
const screenpipeMultiQuery = z.object({
  queries: z.array(screenpipeQuery),
});

async function queryScreenpipeNtimes(
  params: z.infer<typeof screenpipeMultiQuery>
) {
  return Promise.all(params.queries.map(queryScreenpipe));
}

// Add this new function to handle screenpipe requests
async function queryScreenpipe(params: z.infer<typeof screenpipeQuery>) {
  try {
    const queryParams = new URLSearchParams({
      q: params.q,
      offset: params.offset.toString(),
      limit: params.limit.toString(),
      start_date: params.startTime,
      end_date: params.endTime,
      content_type: params.contentType,
    });
    console.log("calling screenpipe", JSON.stringify(params));
    const response = await fetch(`http://localhost:3030/search?${queryParams}`);
    if (!response.ok) {
      const text = await response.text();
      throw new Error(`HTTP error! status: ${response.status} ${text}`);
    }
    const result = await response.json();
    console.log("result", result);
    return result;
  } catch (error) {
    console.error("Error querying screenpipe:", error);
    return null;
  }
}

const text = await generateText({
  model: useOllama ? ollama(model) : provider(model),
  tools: {
    query_screenpipe: {
      description:
        "Query the local screenpipe instance for relevant information. You will return multiple queries under the key 'queries'.",
      parameters: screenpipeMultiQuery,
      execute: queryScreenpipeNtimes,
    },
  },
  toolChoice: "required",
  messages: [
    // ...

AI_APICallError: Invalid JSON response
    at eval (index.mjs:531:11)
    at async postToApi (index.mjs:374:14)
    at async OllamaChatLanguageModel.doGenerate (index.mjs:461:40)
    at async fn (index.mjs:1565:30)
    at async eval (index.mjs:216:22)
    at async _retryWithExponentialBackoff (index.mjs:269:12)
    at async fn (index.mjs:1555:32)
    at async eval (index.mjs:216:22)
    at async handleSendMessage (VM1132 chat-list-openai-v2.tsx:121:26)
Caused by: AI_TypeValidationError: Type validation failed: Value: {"model":"llama3.1","created_at":"2024-07-26T14:43:26.232Z","message":{"role":"assistant","content":"","tool_calls":[{"function":{"name":"query_screenpipe","arguments":{"queries":[{"end_date":"2024-07-26T14:43:15Z","limit":10,"offset":0,"q":"email","start_date":"2024-07-26T14:00:00Z"}]}}}]},"done_reason":"stop","done":true,"total_duration":11143141958,"load_duration":6814941000,"prompt_eval_count":1248,"prompt_eval_duration":2309899000,"eval_count":83,"eval_duration":2013074000}.
Error message: [
  {
    "code": "invalid_type",
    "expected": "string",
    "received": "object",
    "path": [
      "message",
      "tool_calls",
      0,
      "function",
      "arguments"
    ],
    "message": "Expected string, received object"
  }
]
    at safeValidateTypes (index.mjs:237:14)
    at safeParseJSON (index.mjs:280:12)
    at eval (index.mjs:525:24)
    at async postToApi (index.mjs:374:14)
    at async OllamaChatLanguageModel.doGenerate (index.mjs:461:40)
    at async fn (index.mjs:1565:30)
    at async eval (index.mjs:216:22)
    at async _retryWithExponentialBackoff (index.mjs:269:12)
    at async fn (index.mjs:1555:32)
    at async eval (index.mjs:216:22)
    at async handleSendMessage (VM1132 chat-list-openai-v2.tsx:121:26)
Caused by: ZodError
    at get error (index.mjs:699:31)
    at safeValidateTypes (index.mjs:239:33)
    at safeParseJSON (index.mjs:280:12)
    at eval (index.mjs:525:24)
    at async postToApi (index.mjs:374:14)
    at async OllamaChatLanguageModel.doGenerate (index.mjs:461:40)
    at async fn (index.mjs:1565:30)
    at async eval (index.mjs:216:22)
    at async _retryWithExponentialBackoff (index.mjs:269:12)
    at async fn (index.mjs:1555:32)
    at async eval (index.mjs:216:22)
    at async handleSendMessage (VM1132 chat-list-openai-v2.tsx:121:26)

matrushka commented 3 months ago

@louis030195 seems like you couldn't configure it properly (or maybe forgot to run npm i in your project after changing the package). I guess you'll have to try again or wait until this PR is accepted.

louis030195 commented 3 months ago

got it working!! llama3.1 function calling is on @matrushka

had to do this hack though:

const { textStream } = ollama
  ? await streamText({
      model: ollama(model),
      prompt: JSON.stringify([
        {
          role: "user",
          content:
            messages.findLast((msg) => msg.role === "user")?.content ||
            inputMessage,
        },
        {
          role: "assistant",
          content: text.toolCalls,
        },
        {
          role: "tool",
          content: text.toolResults,
        },
      ]),
    })
  : await streamText({
      model: provider(model),
      messages: [
        {
          role: "user",
          content:
            messages.findLast((msg) => msg.role === "user")?.content ||
            inputMessage,
        },
        {
          role: "assistant",
          content: text.toolCalls,
        },
        {
          role: "tool",
          content: text.toolResults,
        },
      ],
    });

basically got a JSON unmarshalling error; I suspect the RAG data was in a weird format in the message, so I just sent a string

matrushka commented 3 months ago

I had a similar issue as well; it seems like when your tool returns an object instead of a string, Ollama throws that error. It went away when I encoded the tool response as JSON. Might be interesting to implement that behavior in the provider so users don't need to care about it.
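A minimal sketch of that workaround on the application side, using the tool definitions from the earlier snippet (the JSON.stringify wrapper around the result is the only change), until the provider serializes tool results itself:

const tools = {
  query_screenpipe: {
    description:
      "Query the local screenpipe instance for relevant information. You will return multiple queries under the key 'queries'.",
    parameters: screenpipeMultiQuery,
    // Serialize the tool result to a string before it is fed back to Ollama,
    // since an object payload currently triggers the unmarshalling error.
    execute: async (params: z.infer<typeof screenpipeMultiQuery>) =>
      JSON.stringify(await queryScreenpipeNtimes(params)),
  },
};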

sgomez commented 3 months ago

Hi @matrushka!

I think this is not working with the new API. My previous tool-call inference only appeared to work because I was injecting the tools into the system prompt and detecting whether the response contained JSON in order to infer the tool response. This is the response from Ollama with your code:

"responseMessages": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "{\n  \"name\": \"weather\",\n  \"arguments\": {\n    \"location\": \"San Francisco\"\n  }\n} {\n  \"name\": \"cityAttractions\",\n  \"arguments\": {\n    \"city\": \"San Francisco\"\n  }\n}"
        }
      ]
    }
  ],

You can see the response does not contain the tool info in the tool-call property; the JSON ends up inside the answer text instead.

If I remove the system tool prompt injection:

 "responseMessages": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": ""
        },
        {
          "type": "tool-call",
          "toolCallId": "KuRIXth",
          "toolName": "cityAttractions",
          "args": {
            "city": "San Francisco"
          }
        },
        {
          "type": "tool-call",
          "toolCallId": "T4FEOtg",
          "toolName": "weather",
          "args": {
            "location": "San Francisco"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": [
        {
          "type": "tool-result",
          "toolCallId": "T4FEOtg",
          "toolName": "weather",
          "result": {
            "location": "San Francisco",
            "temperature": 79
          }
        }
      ]
    }
  ],

Now the content is empty and the Ollama answer contains the tool calls in the right property.
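In other words, with the native API the provider can build the tool-call parts directly from message.tool_calls. A rough sketch of that mapping (illustrative only, not the actual provider code; Ollama does not return a call id, so a short one is generated here):

type OllamaToolCall = {
  function: { name: string; arguments: Record<string, unknown> };
};

function mapToolCalls(toolCalls: OllamaToolCall[]) {
  return toolCalls.map((call) => ({
    type: "tool-call" as const,
    // Ollama does not include an id, so generate a short random one.
    toolCallId: Math.random().toString(36).slice(2, 9),
    toolName: call.function.name,
    args: call.function.arguments,
  }));
}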

sgomez commented 3 months ago

@matrushka I forgot to say that I tested with llama3.1. It may be that some models give more priority to the system prompt than to the tool suggestions, so it is likely that, depending on the model, it will work one way or the other. In any case, the injection prompt is no longer necessary. Do you mind removing it? Or, if you prefer, I can accept your PR as is and take care of finishing it.

matrushka commented 3 months ago

@sgomez I see what you mean and made the changes. I saw similar results to what you described: depending on the model, it might tend to use the tools a bit less, but I think this can be improved with prompting. Tested it with mistral and it used the tools as expected (sometimes it needed a bit of encouragement in the prompting 😄).

sgomez commented 3 months ago

Thanks @matrushka! Sorry, but I am on holiday and cannot check this as fast as I'd like.

There is some dead code now, but nothing to worry about. I will remove it before the next release.