I tried your first example on local-ai:v2.15.0-hipblas-ffmpeg
and initially ran into an issue...
{
"created": 1715667028,
"object": "chat.completion",
"id": "52e16621-7a92-496b-b9f3-46e832775765",
"model": "llama-3-8b-instruct-coder",
"choices": [
{
"index": 0,
"finish_reason": "tool_calls",
"message": {
"role": "assistant",
"content": null,
"tool_calls": [
{
"index": 0,
"id": "52e16621-7a92-496b-b9f3-46e832775765",
"type": "function",
"function": {
"name": "archival_memory_insert",
"arguments": "{\"content\":\"Hello, world!\",\"request_heartbeat\":true}"
}
}
]
}
}
],
"usage": {
"prompt_tokens": 1236,
"completion_tokens": 28,
"total_tokens": 1264
}
}
I had to add this to the yaml file...
function:
  # set to true to allow the model to call multiple functions in parallel
  parallel_calls: true
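For context, the `function` block sits at the top level of the model YAML. A minimal sketch of where it goes (the `name`, `context_size`, and gguf filename here are just the values visible in the logs below; the rest of my config is omitted):

```yaml
# Minimal sketch of the model config; the function block is the actual fix,
# the other fields are illustrative values taken from the logs below.
name: llama-3-8b-instruct-coder
context_size: 8192
parameters:
  model: Llama-3-8B-Instruct-Coder-Q4_K_M.gguf
function:
  # set to true to allow the model to call multiple functions in parallel
  parallel_calls: true
```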
that helped a little, but then I ran into another issue...
{
"created": 1715679270,
"object": "chat.completion",
"id": "a262c700-2333-4bf9-9cb8-5d35964b385e",
"model": "llama-3-8b-instruct-coder",
"choices": [
{
"index": 0,
"finish_reason": "",
"message": {
"role": "assistant",
"content": "```\n{\n \"name\": \"archival_memory_insert\",\n \"arguments\": {\n \"content\": \"Hello, world!\",\n \"request_heartbeat\": true\n }\n}\n```\nResponse:\n```\n{\"result\": \"Archival memory inserted successfully\"}\n```\nPlease note that the actual response format and content may vary depending on the function being called."
}
}
],
"usage": {
"prompt_tokens": 1236,
"completion_tokens": 28,
"total_tokens": 1264
}
}
I didn't really know what this meant either, so I checked the logs; seems it's throwing an error now...
localai-api-1 | 9:35AM INF [llama-cpp] Loads OK
localai-api-1 | 9:35AM DBG GRPC(Llama-3-8B-Instruct-Coder-Q4_K_M.gguf-127.0.0.1:39859): stdout {"timestamp":1715679307,"level":"INFO","function":"launch_slot_with_data","line":887,"message":"slot is processing task","slot_id":0,"task_id":0}
localai-api-1 | 9:35AM DBG GRPC(Llama-3-8B-Instruct-Coder-Q4_K_M.gguf-127.0.0.1:39859): stdout {"timestamp":1715679307,"level":"INFO","function":"update_slots","line":1787,"message":"kv cache rm [p0, end)","slot_id":0,"task_id":0,"p0":0}
localai-api-1 | 9:35AM DBG GRPC(Llama-3-8B-Instruct-Coder-Q4_K_M.gguf-127.0.0.1:39859): stdout {"timestamp":1715679317,"level":"INFO","function":"print_timings","line":334,"message":"prompt eval time = 7660.65 ms / 1236 tokens ( 6.20 ms per token, 161.34 tokens per second)","slot_id":0,"task_id":0,"t_prompt_processing":7660.655,"num_prompt_tokens_processed":1236,"t_token":6.197940938511326,"n_tokens_second":161.34390597148678}
localai-api-1 | 9:35AM DBG GRPC(Llama-3-8B-Instruct-Coder-Q4_K_M.gguf-127.0.0.1:39859): stdout {"timestamp":1715679317,"level":"INFO","function":"print_timings","line":348,"message":"generation eval time = 2694.03 ms / 28 runs ( 96.22 ms per token, 10.39 tokens per second)","slot_id":0,"task_id":0,"t_token_generation":2694.032,"n_decoded":28,"t_token":96.21542857142857,"n_tokens_second":10.39334350891155}
localai-api-1 | 9:35AM DBG GRPC(Llama-3-8B-Instruct-Coder-Q4_K_M.gguf-127.0.0.1:39859): stdout {"timestamp":1715679317,"level":"INFO","function":"print_timings","line":357,"message":" total time = 10354.69 ms","slot_id":0,"task_id":0,"t_prompt_processing":7660.655,"t_token_generation":2694.032,"t_total":10354.687}
localai-api-1 | 9:35AM DBG GRPC(Llama-3-8B-Instruct-Coder-Q4_K_M.gguf-127.0.0.1:39859): stdout {"timestamp":1715679317,"level":"INFO","function":"update_slots","line":1602,"message":"slot released","slot_id":0,"task_id":0,"n_ctx":8192,"n_past":1263,"n_system_tokens":0,"n_cache_tokens":1264,"truncated":false}
localai-api-1 | 9:35AM DBG GRPC(Llama-3-8B-Instruct-Coder-Q4_K_M.gguf-127.0.0.1:39859): stdout {"timestamp":1715679317,"level":"INFO","function":"update_slots","line":1547,"message":"all slots are idle and system prompt is empty, clear the KV cache"}
localai-api-1 | 9:35AM ERR multiple results: unable to unmarshal llm result error="json: cannot unmarshal object into Go value of type []map[string]interface {}" escapedLLMResult="{\"arguments\": {\"content\": \"Hello, world!\", \"request_heartbeat\": true}, \"function\": \"archival_memory_insert\"}"
localai-api-1 | 9:35AM DBG Function return: {"arguments": {"content": "Hello, world!", "request_heartbeat": true}, "function": "archival_memory_insert"} []
localai-api-1 | 9:35AM DBG nothing to do, computing a reply
localai-api-1 | 9:35AM DBG handleQuestion: function result did not contain a valid JSON object
localai-api-1 | 9:35AM DBG No action received from LLM, without a message, computing a reply
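The `unable to unmarshal llm result` error is a plain Go type mismatch: the model emitted a single JSON object, but the parser tried to decode it into a slice. Here's a standalone sketch of my own (not LocalAI code) that reproduces it:

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// The exact payload from the log line above.
	llmResult := `{"arguments": {"content": "Hello, world!", "request_heartbeat": true}, "function": "archival_memory_insert"}`

	// Decoding a single object into a slice fails with the same message as the log:
	// "json: cannot unmarshal object into Go value of type []map[string]interface {}"
	var many []map[string]interface{}
	if err := json.Unmarshal([]byte(llmResult), &many); err != nil {
		fmt.Println("array decode failed:", err)
	}

	// Decoding into a single map succeeds, which is why a
	// try-array-then-object fallback (suggested later in this thread) helps.
	var one map[string]interface{}
	if err := json.Unmarshal([]byte(llmResult), &one); err == nil {
		fmt.Println("object decode succeeded, function =", one["function"])
	}
}
```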
It would seem that function calls are still not working correctly. edit: AFAIK this model supports functions, but I can also try something like Meta-Llama-3-8B-Instruct-function-calling
to see if it still has issues...
Cheers
Thanks for testing @bunder2015. @mudler merged some changes that enable better grammar management (#2328), and I've been testing them. However, I'm running into some issues, so I'm documenting them here.
Model: huggingface://TheBloke/NeuralHermes-2.5-Mistral-7B-GGUF/neuralhermes-2.5-mistral-7b.Q8_0.gguf Model File: https://github.com/lenaxia/home-ops-prod/blob/0df710aa61fca15ebfedb5d8448efb2a83a68e29/cluster/apps/home/localai/app/models/neuralhermes-2.5-mistral-7b.yaml
function:
  # disable injecting the "answer" tool
  disable_no_action: true
  # This allows the grammar to also return messages
  grammar_message: true
  # Prefix to add to the grammar
  grammar_prefix: '<tool_call>\n'
  return_name_in_function_response: true
  # Without grammar uncomment the lines below
  # Warning: this is relying only on the capability of the
  # LLM model to generate the correct function call.
  #no_grammar: true
  # json_regex_match: "(?s)<tool_call>(.*?)</tool_call>"
  replace_results:
    "<tool_call>": ""
    "\'": "\""
    "Processing user message.": ""
    "“": "\""
    ": True": ": \"True\""
    ": False": ": \"False\""
The first issue is around unmarshalling the returned JSON object; the parsing seems to be a bit fragile:
5:54AM DBG LLM result: Processing user message.
[
{
'name': 'send_message',
'arguments': {
'message': 'Hello. As instructed, I am refraining from utilizing contractions in this response. How may I serve you today, Chad?'
}
}
]
5:54AM DBG Replacing <tool_call> with
5:54AM DBG Replacing ' with "
5:54AM DBG Replacing Processing user message. with
5:54AM DBG LLM result(processed):
[
{
"name": "send_message",
"arguments": {
"message": "Hello. As instructed, I am refraining from utilizing contractions in this response. How may I serve you today, Chad?"
}
}
]
5:54AM WRN unable to unmarshal llm result error="json: cannot unmarshal array into Go value of type map[string]interface {}" escapedLLMResult="\n[\n {\n \"name\": \"send_message\",\n \"arguments\": {\n \"message\": \"Hello. As instructed, I am refraining from utilizing contractions in this response. How may I serve you today, Chad?\"\n }\n }\n]"
5:54AM DBG Function return:
[
{
"name": "send_message",
"arguments": {
"message": "Hello. As instructed, I am refraining from utilizing contractions in this response. How may I serve you today, Chad?"
}
}
] map[]
5:54AM DBG nothing function results but we had a message from the LLM
5:54AM DBG Response: {"created":1715925016,"object":"chat.completion","id":"6473e1fd-081b-4815-a612-e124ee515ebd","model":"neuralhermes-2.5-7b","choices":[{"index":0,"finish_reason":"","message":{"role":"assistant","content":"Processing user message.\n[\n {\n 'name': 'send_message',\n 'arguments': {\n 'message': 'Hello. As instructed, I am refraining from utilizing contractions in this response. How may I serve you today, Chad?'\n }\n }\n]"}}],"usage":{"prompt_tokens":3127,"completion_tokens":73,"total_tokens":3200}}
The second issue is that in your commit you suggest using the replacement `"\'": "\""`. However, this poses a problem when strings are returned with contractions in them, as in the example below:
6:17AM DBG Replacing ' with "
6:17AM DBG Replacing Processing user message. with
6:17AM DBG Replacing " with "
6:17AM DBG Replacing : True with : "True"
6:17AM DBG Replacing : False with : "False"
6:17AM DBG Replacing <tool_call> with
6:17AM DBG LLM result(processed):
[
{
"index": 1,
"id": "1b96676c-16fd-4766-a4ef-52420",
"type": "function",
"function": {
"name": "conversation_search",
"arguments": "{\n \"query\": \"Hi\",\n \"request_heartbeat\": true\n}"
}
},
{
"index": 2,
"id": "c1ad50d6-33a2-472f-ace5-d4e48",
"type": "function",
"function": {
"name": "send_message",
"arguments": "{\n \"message\": \"Hello Chad, it"s nice to "...
file
6:17AM WRN unable to unmarshal llm result error="invalid character 's' after object key:value pair" escapedLLMResult="\n[\n {\n \"index\": 1,\n \"id\": \"1b96676c-16fd-4766-a4ef-52420\",\n \"type\": \"function\",\n \"function\": {\n \"name\": \"conversation_search\",\n \"arguments\": \"{\\n \\\"query\\\": \\\"Hi\\\",\\n \\\"request_heartbeat\\\": true\\n}\"\n }\n },\n {\n \"index\": 2,\n \"id\": \"c1ad50d6-33a2-472f-ace5-d4e48\",\n \"type\": \"function\",\n \"function\": {\n \"name\": \"send_message\",\n \"arguments\": \"{\\n \\\"message\\\": \\\"Hello Chad, it\"s nice to \"...\nfile"
6:17AM DBG Function return:
[
{
"index": 1,
"id": "1b96676c-16fd-4766-a4ef-52420",
"type": "function",
"function": {
"name": "conversation_search",
"arguments": "{\n \"query\": \"Hi\",\n \"request_heartbeat\": true\n}"
}
},
{
"index": 2,
"id": "c1ad50d6-33a2-472f-ace5-d4e48",
"type": "function",
"function": {
"name": "send_message",
"arguments": "{\n \"message\": \"Hello Chad, it"s nice to "...
file map[]
6:17AM DBG nothing function results but we had a message from the LLM
6:17AM DBG Response: {"created":1715926617,"object":"chat.completion","id":"8b66f810-66e1-4fec-8734-993472c95f88","model":"neuralhermes-2.5-7b","choices":[{"index":0,"finish_reason":"","message":{"role":"assistant","content":"Processing user message.\n[\n {\n 'index': 1,\n 'id': '1b96676c-16fd-4766-a4ef-52420',\n 'type': 'function',\n 'function': {\n 'name': 'conversation_search',\n 'arguments': '{\\n \\\"query\\\": \\\"Hi\\\",\\n \\\"request_heartbeat\\\": true\\n}'\n }\n },\n {\n 'index': 2,\n 'id': 'c1ad50d6-33a2-472f-ace5-d4e48',\n 'type': 'function',\n 'function': {\n 'name': 'send_message',\n 'arguments': '{\\n \\\"message\\\": \\\"Hello Chad, it's nice to '...\nfile"}}],"usage":{"prompt_tokens":3116,"completion_tokens":205,"total_tokens":3321}}
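To make the contraction failure concrete, here's a standalone repro of my own (not project code): the blanket `'` → `"` swap turns an apostrophe inside a value into an unbalanced quote, and the result is no longer valid JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

func main() {
	// Single-quoted output like the model produces above.
	out := `{'message': 'Hello Chad, it's nice to meet you'}`

	fixed := strings.ReplaceAll(out, "'", `"`)
	fmt.Println(fixed)                     // {"message": "Hello Chad, it"s nice to meet you"}
	fmt.Println(json.Valid([]byte(fixed))) // false: the apostrophe became a stray quote
}
```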
Claude Sonnet suggests this change to the grammar to support this:
Here's the updated JSON grammar that ensures double-quoted key-value pairs and proper escaping of special characters within value strings:
root ::= object
object ::= "{" ws ( dquoted_string ":" ws value ("," ws dquoted_string ":" ws value)* )? "}" ws
value ::= dquoted_string | object | array | number | ("true" | "false" | "null") ws
array ::= "[" ws ( value ("," ws value)* )? "]" ws
dquoted_string ::= "\"" (
  [^"\\] |
  "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
)* "\"" ws
number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws
ws ::= ([ \t\n] ws)?
The main changes are:
- The `object` rule now expects `dquoted_string` for keys, followed by a colon `:` and a `value`.
- The `value` rule includes `dquoted_string` as a possible value type, ensuring that all string values are double-quoted.
- A new `dquoted_string` rule has been added, which defines a double-quoted string with properly escaped characters inside.

The `dquoted_string` rule is defined as follows:

dquoted_string ::= "\"" (
  [^"\\] |
  "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
)* "\"" ws
This rule ensures that:
- The string starts and ends with a double quote `"`.
- Within the string, any character other than double quote `"` and backslash `\` is allowed.
- If a backslash `\` is encountered, it can be followed by one of the escape sequences:
  - `\/` for forward slash
  - `\"` for double quote
  - `\'` for single quote
  - `\b` for backspace
  - `\f` for form feed
  - `\n` for newline
  - `\r` for carriage return
  - `\t` for tab
  - `\u` followed by four hexadecimal digits for a Unicode escape sequence

With these grammar rules, the JSON output will have all keys and values double-quoted, and any double quotes or other special characters within a value string will be properly escaped using the backslash `\` escape sequence.

For example, a valid JSON output according to this grammar would be:

{
  "key1": "value with \"double quotes\"",
  "key2": "value with 'single quotes'",
  "key3": "value with \\ backslash",
  "key4": "value with \u0026 unicode escape"
}
I believe the above grammar will also solve the last error I'm seeing:
6:15AM DBG LLM result(processed):
[
{
"index": 1,
"id": "e51d7275-ce6b-4e7c-b556-d447d",
"type": "function",
"function": {
"name": "conversation_search",
"arguments": {
"page": 0,
"query": "Hi",
"request_heartbeat": True
}
}
},
{
"index": 2,
"id": "3428a0a9-edbb-4eb0-8cb5-612c7d",
"type": "function",
"function": {
"name": "send_message",
"arguments": {
"message": "I’m glad we caught up 😊. My name is MemGPT, and I will try to make this conversation feel as close to talking to a human friend as possible. The initial couple of responses from me might be somewhat mechanical, but give me some time and I’ll get there. Did you have something specific you wanted to discuss today?"
}
}
}
]
6:15AM WRN unable to unmarshal llm result error="invalid character 'T' looking for beginning of value" escapedLLMResult="\n[\n {\n \"index\": 1,\n \"id\": \"e51d7275-ce6b-4e7c-b556-d447d\",\n \"type\": \"function\",\n \"function\": {\n \"name\": \"conversation_search\",\n \"arguments\": {\n \"page\": 0,\n \"query\": \"Hi\",\n \"request_heartbeat\": True\n }\n }\n },\n {\n \"index\": 2,\n \"id\": \"3428a0a9-edbb-4eb0-8cb5-612c7d\",\n \"type\": \"function\",\n \"function\": {\n \"name\": \"send_message\",\n \"arguments\": {\n \"message\": \"I’m glad we caught up 😊. My name is MemGPT, and I will try to make this conversation feel as close to talking to a human friend as possible. The initial couple of responses from me might be somewhat mechanical, but give me some time and I’ll get there. Did you have something specific you wanted to discuss today?\"\n }\n }\n }\n]"
I'm working around this issue right now with these replace strings:
": True": ": \"True\""
": False": ": \"False\""
Here's a suggestion from Claude Sonnet around catching the unmarshalling error and trying to unmarshal the result into an array instead. This doesn't solve the problem of being tolerant towards mildly malformed JSON; I think that would require another library like github.com/francoispqvi/gojay, which I'm not sure you're interested in doing.
=====================
To make the code more robust and handle cases where the LLM result can be a single object or an array of objects, we can modify the `ParseFunctionCall` function in `parse.go`. Here's the updated version:
func ParseFunctionCall(llmresult string, functionConfig FunctionsConfig) []FuncCallResults {
log.Debug().Msgf("LLM result: %s", llmresult)
for k, v := range functionConfig.ReplaceResults {
log.Debug().Msgf("Replacing %s with %s", k, v)
llmresult = strings.ReplaceAll(llmresult, k, v)
}
log.Debug().Msgf("LLM result(processed): %s", llmresult)
// NOTE: functionConfig.ParallelCalls is no longer read here; the grammar
// path below handles both a single object and an array of objects.
useGrammars := !functionConfig.NoGrammar
functionNameKey := "function"
if functionConfig.FunctionName {
functionNameKey = "name"
}
results := []FuncCallResults{}
returnResult := func(s string) (name, arguments string, e error) {
// As we have to change the result before processing, we can't stream the answer token-by-token (yet?)
var ss map[string]interface{}
// This prevents newlines from breaking JSON parsing for clients
s = utils.EscapeNewLines(s)
err := json.Unmarshal([]byte(s), &ss)
if err != nil {
log.Warn().Err(err).Str("escapedLLMResult", s).Msg("unable to unmarshal llm result")
return "", "", err
}
log.Debug().Msgf("Function return: %s %+v", s, ss)
// The grammar defines the function name as "function", while OpenAI returns "name"
func_name, ok := ss[functionNameKey]
if !ok {
return "", "", fmt.Errorf("unable to find function name in result")
}
// Similarly, while here arguments is a map[string]interface{}, OpenAI actually wants a stringified object
args, ok := ss["arguments"] // arguments needs to be a string, but we return an object from the grammar result (TODO: fix)
if !ok {
return "", "", fmt.Errorf("unable to find arguments in result")
}
d, _ := json.Marshal(args)
funcName, ok := func_name.(string)
if !ok {
return "", "", fmt.Errorf("unable to cast function name to string")
}
return funcName, string(d), nil
}
// if no grammar is used, we have to extract function and arguments from the result
if !useGrammars {
// the response is a string that we have to parse
result := make(map[string]string)
if functionConfig.ResponseRegex != "" {
// We use named regexes here to extract the function name and arguments
// obviously, this expects the LLM to be stable and return correctly formatted JSON
// TODO: optimize this and pre-compile it
var respRegex = regexp.MustCompile(functionConfig.ResponseRegex)
match := respRegex.FindStringSubmatch(llmresult)
for i, name := range respRegex.SubexpNames() {
if i != 0 && name != "" && len(match) > i {
result[name] = match[i]
}
}
// TODO: open point about multiple results and/or mixed with chat messages
// This is not handled as for now, we only expect one function call per response
functionName := result[functionNameKey]
if functionName == "" {
return results
}
} else if functionConfig.JSONRegexMatch != "" {
//re := regexp.MustCompile(`(?s)<tool_call>(.*?)</tool_call>`)
//m:= re.FindStringSubmatch(`<tool_call>{ foo barr }</tool_call>`)
// We use a regex to extract the JSON object from the response
var respRegex = regexp.MustCompile(functionConfig.JSONRegexMatch)
match := respRegex.FindStringSubmatch(llmresult)
if len(match) < 2 {
return results
}
funcName, args, err := returnResult(match[1])
if err != nil {
return results
}
return append(results, FuncCallResults{Name: funcName, Arguments: args})
} else {
funcName, args, err := returnResult(llmresult)
if err != nil {
return results
}
return append(results, FuncCallResults{Name: funcName, Arguments: args})
}
return append(results, FuncCallResults{Name: result[functionNameKey], Arguments: result["arguments"]})
}
// with grammars
// Handle the case where the LLM result is a single object or an array of objects
var ss []map[string]interface{}
s := utils.EscapeNewLines(llmresult)
err := json.Unmarshal([]byte(s), &ss)
if err != nil {
// If the LLM result is a single object, try unmarshaling it into a single map
var singleObj map[string]interface{}
err = json.Unmarshal([]byte(s), &singleObj)
if err != nil {
log.Warn().Err(err).Str("escapedLLMResult", s).Msg("unable to unmarshal llm result")
return results
}
ss = []map[string]interface{}{singleObj}
}
for _, s := range ss {
func_name, ok := s[functionNameKey]
if !ok {
continue
}
args, ok := s["arguments"]
if !ok {
continue
}
d, _ := json.Marshal(args)
funcName, ok := func_name.(string)
if !ok {
continue
}
results = append(results, FuncCallResults{Name: funcName, Arguments: string(d)})
}
return results
}
The main changes are:
- In the `returnResult` function, `ss` is a single `map[string]interface{}`, so it handles one object at a time.
- When grammars are used, we first try to unmarshal the LLM result into a `[]map[string]interface{}` (`ss`). If that fails, we assume it's a single object and try to unmarshal it into a single `map[string]interface{}` (`singleObj`). If both fail, we log a warning and return an empty slice.
- We then iterate over `ss` (which could hold one or more objects) and process each object to extract the function name and arguments.

With these changes, the code should be more robust and able to handle cases where the LLM result is a single object or an array of objects.
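A hypothetical usage sketch exercising both shapes that failed earlier in this thread (field names as in the snippet above, `fmt` assumed imported; not a test from the actual repo):

```go
func exampleParse() {
	// FunctionName: true makes the parser look up the OpenAI-style "name" key.
	cfg := FunctionsConfig{FunctionName: true}

	// Single object: previously died with "cannot unmarshal object into ... []map".
	single := `{"name": "send_message", "arguments": {"message": "hi"}}`

	// Array of objects: previously died with "cannot unmarshal array into ... map".
	multi := `[{"name": "conversation_search", "arguments": {"query": "Hi"}}, {"name": "send_message", "arguments": {"message": "hello"}}]`

	for _, r := range ParseFunctionCall(single, cfg) {
		fmt.Printf("%s(%s)\n", r.Name, r.Arguments)
	}
	for _, r := range ParseFunctionCall(multi, cfg) {
		fmt.Printf("%s(%s)\n", r.Name, r.Arguments)
	}
}
```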
Taking another look at this; I haven't tried the new changes yet.
I was able to cobble together a small coder chatbot in C++ and started adding functions to it. When I tested sending the functions in the JSON request, I got a blank response to my first prompt, and the second prompt ran the function for no reason, stuffing the description of the function call into the arguments. :rofl: (edit: or maybe that's why the first one returned blank?)
{"messages":[{"content":"I am Pascal, a friendly expert in programming versed in many languages and technologies.\nWe work for Banana Technologies, we specialize in developing Linux desktop applications running on Gentoo Linux. We also specialize in AI technologies such as multimodal large language models, vision, and AI automation of computer-based tasks.\nWhen writing new code or making changes, I will print the entire file. When printing the file, I will write the file name in a comment at the top of the file. I will not omit code by using ellipses.\n(Program Requirements: None)","role":"system"},{"content":"Can you show me a hello world example in C++?\n","role":"user"}],"model":"llama-3-8b-instruct-coder","tool_choice":"auto","tools":[{"function":{"description":"Update the requirements for the project","name":"updateRequirements","parameters":{"properties":{"requirements":{"description":"The new requirements for the project","type":"string"}},"required":["requirements"],"type":"object"}},"type":"function"}]}
$ ./ai-agent
Enter a multi-line prompt (type 'SEND' to send the prompt to LocalAI, or type 'EXIT' to exit the program):
Can you tell me what our program requirements are?
SEND
Enter a multi-line prompt (type 'SEND' to send the prompt to LocalAI, or type 'EXIT' to exit the program):
Can you tell me what our program requirements are?
SEND
Our program requirements are:
`{"requirements":"The new requirements for the project"}`
Give me a few more days and I might pull git head and see if it's any better. Cheers
Fix was to add `response_format: {"type": "json_object"}` to the completion request.
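For reference, that means adding the field at the top level of the same request body shown earlier, e.g.:

```json
{
  "model": "llama-3-8b-instruct-coder",
  "response_format": {"type": "json_object"},
  "tool_choice": "auto",
  "tools": [{"type": "function", "function": {"name": "updateRequirements", "description": "Update the requirements for the project", "parameters": {"type": "object", "properties": {"requirements": {"type": "string", "description": "The new requirements for the project"}}, "required": ["requirements"]}}}],
  "messages": [{"role": "user", "content": "Can you show me a hello world example in C++?"}]
}
```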
LocalAI version: v2.13.0-cublas-cuda12-ffmpeg
Environment, CPU architecture, OS, and Version: kubernetes helm release: https://github.com/lenaxia/home-ops-prod/blob/bdb6695ba22777c8f4233caaddfc9bfd90b91372/cluster/apps/home/localai/app/helm-release.yaml
Describe the bug
Regular chatting poses no problem; however, any time a function call is defined, the chat very quickly goes into a bad state where the LLM just repeats back to me what I type in, or very rarely stops responding entirely. I have tried this with Llama3, Hermes 2 Pro, Neural Hermes, lunademo, and several other models, and the behavior is more or less consistent.
To Reproduce
env vars:
Deploy helm release: https://github.com/lenaxia/home-ops-prod/blob/bdb6695ba22777c8f4233caaddfc9bfd90b91372/cluster/apps/home/localai/app/helm-release.yaml
Load either Neural Hermes: https://github.com/lenaxia/home-ops-prod/blob/bdb6695ba22777c8f4233caaddfc9bfd90b91372/cluster/apps/home/localai/app/models/NeuralHermes-2.5-Mistral-7b.yaml
Or Llama3 Instruct: https://github.com/lenaxia/home-ops-prod/blob/2377dd84e434a6cbfea51118cea8202d5c209d13/cluster/apps/home/localai/app/models/llama3-instruct.yaml
Or lunademo from the model gallery
Run this curl command and the LLM will make a function call when it should not: https://gist.github.com/lenaxia/388082e0e98beb91f2447073d0d6cd63
Expected behavior
I expect the model to answer properly and to maintain the conversation on an ongoing basis.
Logs
Verbose debug logs with several conversations that resulted in issues: https://gist.github.com/lenaxia/4b02a9cdd72470370b37a33b998b4b42
For easier reproduction I've extracted several raw requests that caused issues (including the one above in the how to reproduce section).
Example requests that caused issues. You can run these by putting them into a JSON file and running
curl $LOCALAI/v1/chat/completions -H "Content-Type: application/json" -d @<filename> | jq
Neural Hermes through MemGPT #1, makes a function call when it should not
Lunademo through Home Assistant Assist, this example results in LLM generating an entire conversation stream.
Neural Hermes through MemGPT #2, results in LLM just repeating back what the user sent