mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
https://localai.io
MIT License
25.78k stars 1.93k forks

Stuck in Function Call Loop #3882

Open daJuels opened 4 weeks ago

daJuels commented 4 weeks ago

LocalAI version: docker image: localai/localai:v2.22.0-aio-gpu-nvidia-cuda-11

Environment, CPU architecture, OS, and Version: Docker on Debian, Intel i9, NVIDIA GPU

Describe the bug When using functions, the AI gets stuck in a loop of function calls. It seems it does not understand the tool result. As documented, after receiving a [TOOL_RESULTS] block the model should process the result and answer the user in the assistant role, not run the same function call again and again...

I'm not sure whether this is an issue with LocalAI, the model, or the chat template. My guess is that maybe the tool_call_id is missing from the prompt, so the model is not able to connect the tool result to the function call.

Any ideas?

To Reproduce Send this request to v1/chat/completions:

{
    "messages": [
        {
            "role": "system",
            "content": "You are an assistant that helps to transform text into special language."
        },
        {
            "role": "user",
            "content": "Transform this text: ExampleText"
        },
        {
            "role": "assistant",
            "content": "",
            "tool_calls": [
                {
                    "id": "06b05978-b3a4-463e-b21f-127bdabb4953",
                    "index": 0,
                    "type": "function",
                    "function": {
                        "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
                        "arguments": "{\"text\":\"ExampleText\"}"
                    }
                }
            ]
        },
        {
            "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
            "role": "tool",
            "content": "nqwprpok--ExampleText-nqwprpok",
            "tool_call_id": "06b05978-b3a4-463e-b21f-127bdabb4953"
        }
    ],
    "model": "gpt-4",
    "response_format": {
        "type": "text"
    },
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {
                            "type": "string",
                            "description": "The text to transform"
                        }
                    },
                    "required": [
                        "text"
                    ],
                    "additionalProperties": false
                },
                "strict": true
            }
        }
    ],
    "tool_choice": "auto"
}
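For scripting the reproduction, the same request body can be built and sent from Python with only the standard library. This is a sketch: the localhost endpoint mentioned in the comment assumes LocalAI's default port and is not part of the report.

```python
# Sketch of the reproduction request; stdlib only. The endpoint in the
# comment at the bottom assumes LocalAI's default port (an assumption).
import json

TOOL_NAME = "TestOpenAi_MyToolClass_TransformToSpecialLanguage"
CALL_ID = "06b05978-b3a4-463e-b21f-127bdabb4953"

def build_payload(tool_call_id: str, tool_result: str) -> dict:
    """Build the follow-up request: user turn, the assistant's tool call,
    and the tool result linked back via tool_call_id."""
    return {
        "messages": [
            {"role": "system",
             "content": "You are an assistant that helps to transform text into special language."},
            {"role": "user", "content": "Transform this text: ExampleText"},
            {"role": "assistant", "content": "", "tool_calls": [{
                "id": tool_call_id, "index": 0, "type": "function",
                "function": {"name": TOOL_NAME,
                             "arguments": json.dumps({"text": "ExampleText"})}}]},
            {"name": TOOL_NAME, "role": "tool",
             "content": tool_result, "tool_call_id": tool_call_id},
        ],
        "model": "gpt-4",
        "response_format": {"type": "text"},
        "tools": [{"type": "function", "function": {
            "name": TOOL_NAME, "strict": True,
            "parameters": {"type": "object",
                           "properties": {"text": {"type": "string",
                                                   "description": "The text to transform"}},
                           "required": ["text"],
                           "additionalProperties": False}}}],
        "tool_choice": "auto",
    }

payload = build_payload(CALL_ID, "nqwprpok--ExampleText-nqwprpok")
body = json.dumps(payload)
# POST `body` to http://localhost:8080/v1/chat/completions with
# Content-Type: application/json to reproduce the looping response.
```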

The response now contains the same function call again:

{
    "created": 0,
    "object": "chat.completion",
    "id": "d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6",
    "model": "gpt-4",
    "choices": [{
        "index": 0,
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": "",
            "tool_calls": [{
                "index": 0,
                "id": "d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6",
                "type": "function",
                "function": {
                    "name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
                    "arguments": "{\"text\":\"ExampleText\"}"
                }
            }]
        }
    }],
    "usage": {
        "prompt_tokens": 190,
        "completion_tokens": 27,
        "total_tokens": 217
    }
}

Expected behavior The response should be an assistant message that processes the tool/function result.
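For comparison, a response that actually processes the tool result would look roughly like this (illustrative values only, not actual LocalAI output):

```json
{
    "object": "chat.completion",
    "model": "gpt-4",
    "choices": [{
        "index": 0,
        "finish_reason": "stop",
        "message": {
            "role": "assistant",
            "content": "The transformed text is: nqwprpok--ExampleText-nqwprpok"
        }
    }]
}
```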

Logs

12:31PM DBG Request received: {"model":"gpt-4","language":"","translate":false,"n":0,"top_p":null,"top_k":null,"temperature":null,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"repeat_last_n":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","response_format":{"type":"text"},"size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"system","content":"You are an assistant that helps to transform text into special language."},{"role":"user","content":"Transform this text: ExampleText"},{"role":"assistant","content":"","tool_calls":[{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}]},{"role":"tool","name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","content":"nqwprpok--ExampleText-nqwprpok"}],"functions":null,"function_call":null,"tools":[{"type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","description":"","strict":true,"parameters":{"additionalProperties":false,"properties":{"text":{"description":"The text to transform","type":"string"}},"required":["text"],"type":"object"}}}],"tool_choice":"auto","stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"backend":"","model_base_name":""}
12:31PM DBG guessDefaultsFromFile: template already set name=gpt-4
12:31PM DBG Configuration read: &{PredictionOptions:{Model:gpt-4.gguf Language: Translate:false N:0 TopP:0xc0014bbac8 TopK:0xc0014bbad0 Temperature:0xc0014bbad8 Maxtokens:0xc0014bbb08 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0014bbb00 TypicalP:0xc0014bbaf8 Seed:0xc0014bbb20 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0014bbac0 Threads:0xc0014bbab8 Debug:0xc001686940 Roles:map[] Embeddings:0xc0014bbb19 Backend: TemplateConfig:{Chat:{{.Input -}}
 ChatMessage:{{if eq .RoleName "user" -}}
[INST] {{.Content }} [/INST]
{{- else if .FunctionCall -}}
[TOOL_CALLS] {{toJson .FunctionCall}} [/TOOL_CALLS]
{{- else if eq .RoleName "tool" -}}
[TOOL_RESULTS] {{.Content}} [/TOOL_RESULTS]
      
{{- else -}}
{{ .Content -}}
{{ end -}} Completion:{{.Input}}
 Edit: Functions:[AVAILABLE_TOOLS] [{{range .Functions}}{"type": "function", "function": {"name": "{{.Name}}", "description": "{{.Description}}", "parameters": {{toJson .Parameters}} }}{{end}} ] [/AVAILABLE_TOOLS]{{.Input }} UseTokenizerTemplate:false JoinChatMessagesByCharacter:0xc0014c9ed0 Video: Image: Audio:} KnownUsecaseStrings:[] KnownUsecases:<nil> PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[type:text] FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:true DisableParallelNewLines:true MixedMode:false NoMixedFreeString:false NoGrammar:true Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)\[TOOL\_CALLS\](.*)] ReplaceFunctionResults:[{Key:(?s)^[^{\[]* Value:} {Key:(?s)[^}\]]*$ Value:} {Key:(?s)\[TOOL\_CALLS\] Value:} {Key:(?s)\[\/TOOL\_CALLS\] Value:}] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0014bbaf0 MirostatTAU:0xc0014bbae8 Mirostat:0xc0014bbae0 NGPULayers:0xc0014bbb10 MMap:0xc0014bba68 MMlock:0xc0014bbb19 LowVRAM:0xc0014bbb19 Grammar: StopWords:[<|im_end|> <dummy32000> </tool_call> <|eot_id|> <|end_of_text|> </s> [/TOOL_CALLS] [/ACTIONS]] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0014bba70 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: 
SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
12:31PM DBG Response needs to process functions
12:31PM DBG Parameters: &{PredictionOptions:{Model:gpt-4.gguf Language: Translate:false N:0 TopP:0xc0014bbac8 TopK:0xc0014bbad0 Temperature:0xc0014bbad8 Maxtokens:0xc0014bbb08 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0014bbb00 TypicalP:0xc0014bbaf8 Seed:0xc0014bbb20 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0014bbac0 Threads:0xc0014bbab8 Debug:0xc001686940 Roles:map[] Embeddings:0xc0014bbb19 Backend: TemplateConfig:{Chat:{{.Input -}}
 ChatMessage:{{if eq .RoleName "user" -}}
[INST] {{.Content }} [/INST]
{{- else if .FunctionCall -}}
[TOOL_CALLS] {{toJson .FunctionCall}} [/TOOL_CALLS]
{{- else if eq .RoleName "tool" -}}
[TOOL_RESULTS] {{.Content}} [/TOOL_RESULTS]
      
{{- else -}}
{{ .Content -}}
{{ end -}} Completion:{{.Input}}
 Edit: Functions:[AVAILABLE_TOOLS] [{{range .Functions}}{"type": "function", "function": {"name": "{{.Name}}", "description": "{{.Description}}", "parameters": {{toJson .Parameters}} }}{{end}} ] [/AVAILABLE_TOOLS]{{.Input }} UseTokenizerTemplate:false JoinChatMessagesByCharacter:0xc0014c9ed0 Video: Image: Audio:} KnownUsecaseStrings:[] KnownUsecases:<nil> PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[type:text] FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:true DisableParallelNewLines:true MixedMode:false NoMixedFreeString:false NoGrammar:true Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)\[TOOL\_CALLS\](.*)] ReplaceFunctionResults:[{Key:(?s)^[^{\[]* Value:} {Key:(?s)[^}\]]*$ Value:} {Key:(?s)\[TOOL\_CALLS\] Value:} {Key:(?s)\[\/TOOL\_CALLS\] Value:}] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0014bbaf0 MirostatTAU:0xc0014bbae8 Mirostat:0xc0014bbae0 NGPULayers:0xc0014bbb10 MMap:0xc0014bba68 MMlock:0xc0014bbb19 LowVRAM:0xc0014bbb19 Grammar:realvalue ::= root-0
space ::= " "?
freestring ::= (
      
            [^\x00] |
            "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      
          )* space
string ::= "\"" (
            [^"\\] |
            "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
          )* "\"" space
root-0-arguments ::= "{" space "\"text\"" space ":" space string "}" space
root-0-name ::= "\"TestOpenAi_MyToolClass_TransformToSpecialLanguage\""
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"name\"" space ":" space root-0-name "}" space
root ::= arr | realvalue
arr  ::=
  "["  (
        realvalue
    (","  realvalue)*
  )? "]"
mixedstring ::= freestring | freestring arr | freestring realvalue | realvalue | arr StopWords:[<|im_end|> <dummy32000> </tool_call> <|eot_id|> <|end_of_text|> </s> [/TOOL_CALLS] [/ACTIONS]] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0014bba70 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
12:31PM DBG templated message for chat: You are an assistant that helps to transform text into special language.
12:31PM DBG templated message for chat: [INST] Transform this text: ExampleText [/INST]
12:31PM DBG templated message for chat: [TOOL_CALLS] [{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}] [/TOOL_CALLS]
12:31PM DBG templated message for chat: [TOOL_RESULTS] nqwprpok--ExampleText-nqwprpok [/TOOL_RESULTS]
12:31PM DBG Prompt (before templating): You are an assistant that helps to transform text into special language.[INST] Transform this text: ExampleText [/INST][TOOL_CALLS] [{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}] [/TOOL_CALLS][TOOL_RESULTS] nqwprpok--ExampleText-nqwprpok [/TOOL_RESULTS]
12:31PM DBG Template found, input modified to: [AVAILABLE_TOOLS] [{"type": "function", "function": {"name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage", "description": "", "parameters": {"additionalProperties":false,"properties":{"text":{"description":"The text to transform","type":"string"}},"required":["text"],"type":"object"} }} ] [/AVAILABLE_TOOLS]You are an assistant that helps to transform text into special language.[INST] Transform this text: ExampleText [/INST][TOOL_CALLS] [{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}] [/TOOL_CALLS][TOOL_RESULTS] nqwprpok--ExampleText-nqwprpok [/TOOL_RESULTS]
12:31PM DBG Prompt (after templating): [AVAILABLE_TOOLS] [{"type": "function", "function": {"name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage", "description": "", "parameters": {"additionalProperties":false,"properties":{"text":{"description":"The text to transform","type":"string"}},"required":["text"],"type":"object"} }} ] [/AVAILABLE_TOOLS]You are an assistant that helps to transform text into special language.[INST] Transform this text: ExampleText [/INST][TOOL_CALLS] [{"index":0,"id":"06b05978-b3a4-463e-b21f-127bdabb4953","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}] [/TOOL_CALLS][TOOL_RESULTS] nqwprpok--ExampleText-nqwprpok [/TOOL_RESULTS]
12:31PM DBG Grammar: realvalue ::= root-0
space ::= " "?
freestring ::= (
      
            [^\x00] |
            "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      
          )* space
string ::= "\"" (
            [^"\\] |
            "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
          )* "\"" space
root-0-arguments ::= "{" space "\"text\"" space ":" space string "}" space
root-0-name ::= "\"TestOpenAi_MyToolClass_TransformToSpecialLanguage\""
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"name\"" space ":" space root-0-name "}" space
root ::= arr | realvalue
arr  ::=
  "["  (
        realvalue
    (","  realvalue)*
  )? "]"
mixedstring ::= freestring | freestring arr | freestring realvalue | realvalue | arr
12:31PM DBG Model already loaded in memory: gpt-4
12:31PM DBG Checking model availability (gpt-4)
12:31PM DBG Model 'gpt-4' already loaded
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341086,"level":"INFO","function":"launch_slot_with_data","line":896,"message":"slot is processing task","slot_id":0,"task_id":399}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341086,"level":"INFO","function":"update_slots","line":1795,"message":"kv cache rm [p0, end)","slot_id":0,"task_id":399,"p0":0}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341087,"level":"INFO","function":"print_timings","line":327,"message":"prompt eval time     =      71.69 ms /   190 tokens (    0.38 ms per token,  2650.37 tokens per second)","slot_id":0,"task_id":399,"t_prompt_processing":71.688,"num_prompt_tokens_processed":190,"t_token":0.37730526315789475,"n_tokens_second":2650.373842205111}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341087,"level":"INFO","function":"print_timings","line":341,"message":"generation eval time =     615.57 ms /    27 runs   (   22.80 ms per token,    43.86 tokens per second)","slot_id":0,"task_id":399,"t_token_generation":615.571,"n_decoded":27,"t_token":22.79892592592593,"n_tokens_second":43.861715382953385}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341087,"level":"INFO","function":"print_timings","line":351,"message":"          total time =     687.26 ms","slot_id":0,"task_id":399,"t_prompt_processing":71.688,"t_token_generation":615.571,"t_total":687.259}
12:31PM DBG GRPC(gpt-4-127.0.0.1:46763): stdout {"timestamp":1729341087,"level":"INFO","function":"update_slots","line":1606,"message":"slot released","slot_id":0,"task_id":399,"n_ctx":8192,"n_past":216,"n_system_tokens":0,"n_cache_tokens":217,"truncated":false}
12:31PM DBG ParseTextContent: [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG CaptureLLMResult: []
12:31PM DBG LLM result: [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG LLM result(processed): [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG LLM result: [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG Replacing (?s)^[^{\[]* with 
12:31PM DBG Replacing (?s)[^}\]]*$ with 
12:31PM DBG Replacing (?s)\[TOOL\_CALLS\] with 
12:31PM DBG Replacing (?s)\[\/TOOL\_CALLS\] with 
12:31PM DBG LLM result(function cleanup): [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}]
12:31PM DBG Function return: [{"arguments":{"text":"ExampleText"},"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage"}] [map[arguments:map[text:ExampleText] name:TestOpenAi_MyToolClass_TransformToSpecialLanguage]]
12:31PM DBG Text content to return: 
12:31PM DBG Response: {"created":1729341086,"object":"chat.completion","id":"d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6","model":"gpt-4","choices":[{"index":0,"finish_reason":"tool_calls","message":{"role":"assistant","content":"","tool_calls":[{"index":0,"id":"d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6","type":"function","function":{"name":"TestOpenAi_MyToolClass_TransformToSpecialLanguage","arguments":"{\"text\":\"ExampleText\"}"}}]}}],"usage":{"prompt_tokens":190,"completion_tokens":27,"total_tokens":217}}
12:31PM INF Success ip=127.0.0.1 latency=691.053453ms method=POST status=200 url=/v1/chat/completions
levidehaan commented 4 days ago

You have tool_choice set to auto; the model might not know it has completed its task.

daJuels commented 4 days ago

You have tool_choice set to auto; the model might not know it has completed its task.

Do you think it is a model issue? Of course the workaround of removing the tool from the follow-up request works for this scenario - but not with multiple tools.
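One client-side mitigation that also works with multiple tools is to detect when the model returns a tool call it has already made, and then retry the request with tool_choice set to "none" so it must answer in plain text. This is a sketch, not a LocalAI fix; the helper below is hypothetical:

```python
# Hypothetical client-side loop guard; not part of LocalAI.
import json

def _key(call: dict) -> tuple:
    """Normalize a tool call to (name, canonical-JSON arguments)."""
    fn = call["function"]
    return (fn["name"], json.dumps(json.loads(fn["arguments"]), sort_keys=True))

def is_repeated_call(messages: list[dict], new_tool_calls: list[dict]) -> bool:
    """True if every newly returned tool call duplicates a call the
    assistant already made earlier in the conversation."""
    seen = {_key(c)
            for m in messages if m.get("role") == "assistant"
            for c in m.get("tool_calls", [])}
    new = {_key(c) for c in new_tool_calls}
    return bool(new) and new <= seen

# The exchange from this issue:
history = [
    {"role": "assistant", "content": "", "tool_calls": [{
        "id": "06b05978-b3a4-463e-b21f-127bdabb4953", "type": "function",
        "function": {"name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
                     "arguments": "{\"text\":\"ExampleText\"}"}}]},
    {"role": "tool", "tool_call_id": "06b05978-b3a4-463e-b21f-127bdabb4953",
     "content": "nqwprpok--ExampleText-nqwprpok"},
]
repeated = [{"id": "d925ce6d-11f6-4e79-a8c4-5fe4a321a3f6", "type": "function",
             "function": {"name": "TestOpenAi_MyToolClass_TransformToSpecialLanguage",
                          "arguments": "{\"text\":\"ExampleText\"}"}}]

if is_repeated_call(history, repeated):
    # Retry the same request with "tool_choice": "none" so the model
    # must answer in plain text instead of calling a tool again.
    pass
```

This sidesteps the loop regardless of how many tools are registered, since the comparison is per call (name plus arguments) rather than per tool.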