withcatai / node-llama-cpp

Run AI models locally on your machine with node.js bindings for llama.cpp. Force a JSON schema on the model output on the generation level
https://withcatai.github.io/node-llama-cpp/
MIT License

Function call error #204

Closed · christianh104 closed this issue 2 months ago

christianh104 commented 2 months ago

Issue description

Using Functionary models with function calling enabled frequently throws an error.

Expected Behavior

I expect the Functionary model to respond normally to queries when function calling is enabled.

Actual Behavior

Often, even without doing anything requiring the use of a function, I get the error:

LlamaFunctionCallValidationError: Function name "all" is not in the supplied functions object at FunctionCallGrammar.parseFunctionCall (file:///C:/Users/chris/Desktop/chat/node_modules/node-llama-cpp/dist/evaluator/LlamaChat/utils/FunctionCallGrammar.js:35:19) at LlamaChat.generateResponse (file:///C:/Users/chris/Desktop/chat/node_modules/node-llama-cpp/dist/evaluator/LlamaChat/LlamaChat.js:365:59) at async LlamaChatSession. (file:///C:/Users/chris/Desktop/chat/node_modules/node-llama-cpp/dist/evaluator/LlamaChatSession/LlamaChatSession.js:108:91) at async withLock (file:///C:/Users/chris/Desktop/chat/node_modules/lifecycle-utils/dist/withLock.js:36:16) at async LlamaChatSession.promptWithMeta (file:///C:/Users/chris/Desktop/chat/node_modules/node-llama-cpp/dist/evaluator/LlamaChatSession/LlamaChatSession.js:88:16) at async LlamaChatSession.prompt (file:///C:/Users/chris/Desktop/chat/node_modules/node-llama-cpp/dist/evaluator/LlamaChatSession/LlamaChatSession.js:72:34) at async file:///C:/Users/chris/Desktop/chat/test.js:24:13 { functions: { random: { description: 'Generates a random number', params: undefined, handler: [Function: handler] } }, chatWrapper: FunctionaryChatWrapper { settings: { functions: { call: { optionalPrefixSpace: true, prefix: '\n<|from|>assistant\n<|recipient|>', paramsPrefix: '\n<|content|>', suffix: '\n' }, result: { prefix: '<|from|>{{functionName}}\n<|recipient|>all\n<|content|>', suffix: '\n' } } }, wrapperName: 'Functionary' }, callText: '\n<|from|>assistant\n<|recipient|>all\n<|content|>\n\n\n\n\n' }

Steps to reproduce

```javascript
import { getLlama, LlamaChatSession, defineChatSessionFunction } from 'node-llama-cpp';

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: 'functionary-7b-v2.1.q4_0.gguf'
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const options = {
    functions: {
        random: defineChatSessionFunction({
            description: "Generates a random number",
            handler() {
                return Math.random();
            }
        })
    }
};

// This works
console.log(await session.prompt('Who is Chuck Norris?', options));

// This throws an error
console.log(await session.prompt('How to list files in a directory on Windows', options));
```

My Environment

- OS: Windows 10
- CPU: Ryzen 5600X
- Node.js: v21.7.2
- node-llama-cpp: v3.0.0-beta.15
- Model: https://huggingface.co/meetkai/functionary-7b-v2.1-GGUF

Additional Context

I know almost nothing about LLMs, so I don't really understand what is happening under the hood, and I may be doing something wrong. I do know how to poke around, though, so I checked out the functions listed in the stack trace along with the Functionary chat wrapper, and this is what I found:

There is a stop pattern in FunctionaryChatWrapper that should end function evaluation mode, and it looks like this: "\n<|from|>assistant\n<|recipient|>all\n<|content|>".

The problem is that the first token of the response is 28705 (not shown in callText, but verified by logging inside the generateResponse() loop), which according to the tokenizer.json on Hugging Face for this model is the character "▁" (Lower One Eighth Block). The first token the pattern expects is 13 (\n), so the pattern is never recognized and function evaluation mode is never stopped.
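To double-check the two token IDs, here is a minimal sketch using node-llama-cpp itself; it assumes the model object exposes a detokenize() method that accepts an array of token IDs, which may differ slightly between beta versions:

```javascript
import { getLlama } from 'node-llama-cpp';

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: 'functionary-7b-v2.1.q4_0.gguf'
});

// Token 28705 is the unexpected first token of the response; token 13 is the
// "\n" that the stop pattern starts with. Printing both makes the mismatch visible.
console.log(JSON.stringify(model.detokenize([28705]))); // expected: "▁" / a lone space
console.log(JSON.stringify(model.detokenize([13])));    // expected: "\n"
```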

I've found two quick workarounds:

  1. Skip the offending token in the loop in generateResponse():

     ```javascript
     for await (const token of evaluationIterator) {
         if (token === 28705 && generatedTokens === 0)
             continue;

         // …
     }
     ```

  2. Add the necessary stop patterns in FunctionaryChatWrapper.js to ignoreStartText and disengageInitiallyEngaged, like so:

     ```javascript
     [
         LlamaText("\n<|from|>assistant\n<|recipient|>all\n<|content|>"),
         LlamaText(new SpecialTokensText("\n<|from|>assistant\n<|recipient|>all\n<|content|>")),
         LlamaText("\n\n<|from|>assistant\n<|recipient|>all\n<|content|>"),
         LlamaText(new SpecialTokensText("\n\n<|from|>assistant\n<|recipient|>all\n<|content|>")),
         LlamaText("▁\n<|from|>assistant\n<|recipient|>all\n<|content|>"),
         LlamaText(new SpecialTokensText("▁\n<|from|>assistant\n<|recipient|>all\n<|content|>"))
     ]
     ```
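Another possible stopgap, at the application level instead of inside the library, is to catch the error and retry the prompt without function calling. This is only a rough sketch: it assumes the error keeps the constructor name "LlamaFunctionCallValidationError" (the class may not be exported, so it matches on the name), that the session remains usable after the failure, and the helper name is just for illustration:

```javascript
// Hypothetical fallback helper: if the function-call grammar validation fails,
// retry the same prompt with the functions option removed so the session can
// still produce an answer.
async function promptWithFunctionFallback(session, text, options) {
    try {
        return await session.prompt(text, options);
    } catch (err) {
        // Assumption: the error's constructor name is stable across versions.
        if (err?.constructor?.name !== "LlamaFunctionCallValidationError")
            throw err;

        // Retry the same prompt without function calling enabled
        const { functions, ...promptOptions } = options ?? {};
        return await session.prompt(text, promptOptions);
    }
}
```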

Relevant Features Used

Are you willing to resolve this issue by submitting a Pull Request?

No, I don’t have the time and I’m okay to wait for the community / maintainers to resolve this issue.

giladgd commented 2 months ago

I found the issue and will release a fix in the next few days. In the meantime, you can use a newer Functionary model version that doesn't have this issue.

github-actions[bot] commented 2 months ago

:tada: This issue has been resolved in version 3.0.0-beta.17 :tada:

The release is available on:

Your semantic-release bot :package::rocket: