Open jonny190 opened 10 months ago
Thanks for reporting an issue.
Currently, there is an issue when using LocalAI. (see https://github.com/jekalmin/extended_openai_conversation/issues/17#issuecomment-1870627832)
Let me try this too and see if something can be done to fix it. (I failed to install LocalAI before, but let me try again!)
I'm using docker compose for my LocalAI instance:
version: '3.6'
services:
  api:
    image: quay.io/go-skynet/local-ai:master-cublas-cuda12
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
I think this may have something to do with LocalAI's chat and completion templates. I customized the plugin connection to remove functions and have the simple template of "You are my smart home assistant." Then I told it "Tell me a joke" to which it replied "Tell me a joke".
But if I build a similar query via curl, I get a proper response:
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf", "messages": [{"role": "system", "content": "You are my smart home assistant."},{"role": "user", "content": " Tell me a joke."}], "temperature": 0.7 }'
{"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Sure, I can do that! Here's a joke for you: Why did the scarecrow win an award? Because he was outstanding in his field."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
I started my LocalAI container in debug mode in an SSH window and watched the logs as they came on screen. The big difference I see is this. For the HASS plug-in:
5:01PM DBG Prompt (before templating): You are my smart home assistant. Tell me a joke.
5:01PM DBG Prompt (after templating): You are my smart home assistant. Tell me a joke.
5:01PM DBG Grammar: root-0-arguments-list-item ::= "{" space "\"domain\"" space ":" space string "," space "\"service\"" space ":" space string "," space "\"service_data\"" space ":" space root-0-arguments-list-item-service-data "}" space
root-0-arguments-list ::= "[" space (root-0-arguments-list-item ("," space root-0-arguments-list-item)*)? "]" space
root-0-arguments ::= "{" space "\"list\"" space ":" space root-0-arguments-list "}" space
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
root-1-arguments ::= "{" space "\"message\"" space ":" space string "}" space
space ::= " "?
string ::= "\"" ( [^"\\] | "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) )* "\"" space
root-0-arguments-list-item-service-data ::= "{" space "\"entity_id\"" space ":" space string "}" space
root-1-function ::= "\"answer\""
root-0-function ::= "\"execute_services\""
root-1 ::= "{" space "\"arguments\"" space ":" space root-1-arguments "," space "\"function\"" space ":" space root-1-function "}" space
root ::= root-0 | root-1
5:01PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf
5:01PM DBG Model 'luna-ai-llama2-uncensored.Q4_K_M.gguf' already loaded
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 is processing [task id: 14]
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 : kv cache rm - [0, end)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: prompt eval time = 74.89 ms / 15 tokens ( 4.99 ms per token, 200.30 tokens per second)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: eval time = 743.44 ms / 30 runs ( 24.78 ms per token, 40.35 tokens per second)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: total time = 818.33 ms
5:01PM DBG Function return: { "arguments": { "message": "Tell me a joke." },"function": "answer"} map[arguments:map[message:Tell me a joke.] function:answer]
5:01PM DBG nothing to do, computing a reply
5:01PM DBG Reply received from LLM: Tell me a joke.
5:01PM DBG Reply received from LLM(finetuned): Tell me a joke.
5:01PM DBG Response: {"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"message":{"role":"assistant","content":"Tell me a joke."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
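The "Function return" entry in the log above is the telling part: the grammar-constrained output picked the answer function and put the user's own text in message. A minimal Python sketch for spotting that pattern when scanning logs; the helper name is_echo is hypothetical, and the input is simply the JSON copied from the log line, not anything returned by a LocalAI API.

```python
import json


def is_echo(function_return: str, user_message: str) -> bool:
    """Return True when the 'answer' function merely repeats the user's message."""
    data = json.loads(function_return)
    if data.get("function") != "answer":
        return False
    message = data.get("arguments", {}).get("message", "")

    def normalize(text: str) -> str:
        # Compare loosely: ignore case, surrounding whitespace and a trailing period.
        return text.strip().rstrip(".").casefold()

    return normalize(message) == normalize(user_message)


# JSON copied from the "Function return" debug entry above.
logged = '{ "arguments": { "message": "Tell me a joke." },"function": "answer"}'
print(is_echo(logged, "Tell me a joke"))  # prints True
```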
Versus the curl command:
4:55PM DBG Prompt (before templating): You are my smart home assistant. You are you?
4:55PM DBG Template found, input modified to: Below is an instruction that describes a task. Write a response that appropriately completes the request.
You are my smart home assistant. You are you?
4:55PM DBG Prompt (after templating): Below is an instruction that describes a task. Write a response that appropriately completes the request.
You are my smart home assistant. You are you?
4:55PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf
4:55PM DBG Model 'luna-ai-llama2-uncensored.Q4_K_M.gguf' already loaded
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 is processing [task id: 12]
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 : kv cache rm - [0, end)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: prompt eval time = 95.88 ms / 48 tokens ( 2.00 ms per token, 500.62 tokens per second)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: eval time = 268.57 ms / 14 runs ( 19.18 ms per token, 52.13 tokens per second)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: total time = 364.45 ms
4:55PM DBG Response: {"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"I am your smart home assistant. How can I assist you today?"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
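The two traces differ in whether a template was applied: the curl request logs "Template found" and a longer prompt after templating, while the prompt of the HASS request comes out unchanged. A rough Python sketch for checking this in a captured log; the function name template_applied is made up for illustration, and it assumes timestamped entries in the "5:01PM DBG ..." format shown in this thread.

```python
import re


def template_applied(log_text: str) -> bool:
    """Return True if the prompt changed between the 'before' and 'after' templating entries."""
    # Each capture runs until the next timestamped entry (e.g. "4:55PM DBG") or the end of the text.
    pattern = r"Prompt \({} templating\): (.*?)(?=\s*\d{{1,2}}:\d{{2}}[AP]M DBG|\Z)"
    before = re.search(pattern.format("before"), log_text, re.S)
    after = re.search(pattern.format("after"), log_text, re.S)
    if not (before and after):
        raise ValueError("log does not contain both templating entries")
    return before.group(1).strip() != after.group(1).strip()


# Entries copied from the HASS plug-in trace above.
hass_log = (
    "5:01PM DBG Prompt (before templating): You are my smart home assistant. Tell me a joke. "
    "5:01PM DBG Prompt (after templating): You are my smart home assistant. Tell me a joke. "
    "5:01PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf"
)
print(template_applied(hass_log))  # prints False
```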
I also converted the curl request to Invoke-WebRequest and ran it on the local PC, just to rule out some sort of issue with remotely accessing the model and files. It still worked fine:
{"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Sure, here's a joke for you: Why did the tomato turn red? Because it saw the salad dressing!"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
I'm in a bit over my head here, but it might be related to this?
@JonahMMay Thanks for sharing information. Have you tried curl with functions added?
curl --location 'http://localhost:8080/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "messages": [
      {
        "role": "system",
        "content": "You are my smart home assistant."
      },
      {
        "role": "user",
        "content": "tell me a joke"
      }
    ],
    "functions": [
      {
        "name": "execute_services",
        "description": "Execute service of devices in Home Assistant.",
        "parameters": {
          "type": "object",
          "properties": {
            "domain": {
              "description": "The domain of the service.",
              "type": "string"
            },
            "service": {
              "description": "The service to be called",
              "type": "string"
            },
            "service_data": {
              "description": "The service data object to indicate what to control.",
              "type": "object"
            }
          },
          "required": [
            "domain",
            "service",
            "service_data"
          ]
        }
      }
    ],
    "function_call": "auto",
    "temperature": 0.7
  }'
Unfortunately, I still didn't get LocalAI to work :(
If I change functions to [] it appears to work fine. Trying the full code in your comment gives:
{"error":{"code":500,"message":"Unrecognized schema: map[description:The service data object to indicate what to control. type:object]","type":""}}
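The 500 error names the service_data schema, which is declared as type object without any properties. One plausible reading (an assumption, not something confirmed in this thread) is that LocalAI's schema-to-grammar conversion cannot handle such open-ended objects. A small Python sketch that lists those nodes in a function's parameters before the request is sent; find_open_objects is a hypothetical helper, not part of LocalAI or the integration.

```python
def find_open_objects(schema: dict, path: str = "parameters") -> list[str]:
    """Return paths of schema nodes declared as type 'object' with no 'properties'."""
    found = []
    if schema.get("type") == "object" and not schema.get("properties"):
        found.append(path)
    # Recurse into nested properties and array item schemas.
    for name, child in schema.get("properties", {}).items():
        found.extend(find_open_objects(child, f"{path}.properties.{name}"))
    items = schema.get("items")
    if isinstance(items, dict):
        found.extend(find_open_objects(items, f"{path}.items"))
    return found


# The "parameters" object from the curl request that triggered the error.
parameters = {
    "type": "object",
    "properties": {
        "domain": {"description": "The domain of the service.", "type": "string"},
        "service": {"description": "The service to be called", "type": "string"},
        "service_data": {
            "description": "The service data object to indicate what to control.",
            "type": "object",
        },
    },
    "required": ["domain", "service", "service_data"],
}
print(find_open_objects(parameters))  # prints ['parameters.properties.service_data']
```

An empty list would mean every object in the schema spells out its properties.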
Sorry to bother you. Could you try this again?
curl --location 'http://localhost:8080/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "messages": [
      {
        "role": "system",
        "content": "You are my smart home assistant."
      },
      {
        "role": "user",
        "content": "turn on livingroom light"
      }
    ],
    "functions": [
      {
        "name": "execute_services",
        "description": "Execute service of devices in Home Assistant.",
        "parameters": {
          "type": "object",
          "properties": {
            "domain": {
              "description": "The domain of the service.",
              "type": "string"
            },
            "service": {
              "description": "The service to be called",
              "type": "string"
            },
            "service_data": {
              "type": "object",
              "properties": {
                "entity_id": {
                  "type": "array",
                  "items": {
                    "type": "string",
                    "description": "The entity_id retrieved from available devices. It must start with domain, followed by dot character."
                  }
                }
              }
            }
          },
          "required": [
            "domain",
            "service",
            "service_data"
          ]
        }
      }
    ],
    "function_call": "auto",
    "temperature": 0.7
  }'
Never mind. I just set up LocalAI successfully.
Awesome! I am headed out of town later today and won't be back until Friday, but I should have remote access to my systems if there's anything you'd like me to test or look at.
I also have this problem with the same results as @JonahMMay. If I remove the functions and function_call block from the example above, I do get an expected response, otherwise the response is the same as my input.
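The comparison described above (the same request, with and without functions) can be scripted. A minimal sketch using only the Python standard library; the URL, model name and function schema are taken from the curl commands in this thread, while the helper names build_payload and post_chat are made up for illustration.

```python
import json
import urllib.request

URL = "http://localhost:8080/v1/chat/completions"
MODEL = "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf"

# Function definition copied from the second curl request in this thread.
EXECUTE_SERVICES = {
    "name": "execute_services",
    "description": "Execute service of devices in Home Assistant.",
    "parameters": {
        "type": "object",
        "properties": {
            "domain": {"description": "The domain of the service.", "type": "string"},
            "service": {"description": "The service to be called", "type": "string"},
            "service_data": {
                "type": "object",
                "properties": {
                    "entity_id": {"type": "array", "items": {"type": "string"}}
                },
            },
        },
        "required": ["domain", "service", "service_data"],
    },
}


def build_payload(user_text, functions=None):
    """Build the chat request body; functions and function_call are added only when given."""
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are my smart home assistant."},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0.7,
    }
    if functions:
        payload["functions"] = functions
        payload["function_call"] = "auto"
    return payload


def post_chat(payload):
    """POST the payload and return the first choice's message (needs a running server)."""
    request = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["choices"][0]["message"]


# Example, with a LocalAI instance listening on URL:
#   print(post_chat(build_payload("tell me a joke")))
#   print(post_chat(build_payload("tell me a joke", [EXECUTE_SERVICES])))
```

Keeping the two payloads identical apart from the functions and function_call keys makes it easier to attribute any difference in the reply to function handling.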
Same issue on my end with OpenAI Extended + LocalAI given a few models I’ve tried. 😞
Vanilla API requests are fine.
When trying to use this with LocalAI, it just spits the prompt I sent back at me. Please see the example below.
localai-api-1 | 4:05PM DBG Request received:
localai-api-1 | 4:05PM DBG Configuration read: &{PredictionOptions:{Model:luna-ai-llama2-uncensored.Q8_0.gguf Language: N:0 TopP:1 TopK:80 Temperature:0.5 Maxtokens:150 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q8_0.gguf F16:true Threads:10 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString:auto functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:22 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:1024 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
localai-api-1 | 4:05PM DBG Response needs to process functions
localai-api-1 | 4:05PM DBG Parameters: &{PredictionOptions:{Model:luna-ai-llama2-uncensored.Q8_0.gguf Language: N:0 TopP:1 TopK:80 Temperature:0.5 Maxtokens:150 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q8_0.gguf F16:true Threads:10 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString:auto functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:22 MMap:false MMlock:false LowVRAM:false Grammar:space ::= " "?
localai-api-1 | string ::= "\"" (
localai-api-1 | [^"\\] |
localai-api-1 | "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
localai-api-1 | )* "\"" space
localai-api-1 | root-0-arguments-list-item-service-data ::= "{" space "\"entity_id\"" space ":" space string "}" space
localai-api-1 | root-0-arguments-list-item ::= "{" space "\"domain\"" space ":" space string "," space "\"service\"" space ":" space string "," space "\"service_data\"" space ":" space root-0-arguments-list-item-service-data "}" space
localai-api-1 | root-0-arguments-list ::= "[" space (root-0-arguments-list-item ("," space root-0-arguments-list-item)*)? "]" space
localai-api-1 | root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
localai-api-1 | root-1-function ::= "\"answer\""
localai-api-1 | root-0-arguments ::= "{" space "\"list\"" space ":" space root-0-arguments-list "}" space
localai-api-1 | root-0-function ::= "\"execute_services\""
localai-api-1 | root-1-arguments ::= "{" space "\"message\"" space ":" space string "}" space
localai-api-1 | root-1 ::= "{" space "\"arguments\"" space ":" space root-1-arguments "," space "\"function\"" space ":" space root-1-function "}" space
localai-api-1 | root ::= root-0 | root-1 StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:1024 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
localai-api-1 | 4:05PM DBG Prompt (before templating): I want you to act as smart home manager of Home Assistant.
localai-api-1 | I will provide information of smart home along with a question, you will truthfully make correction or answer using information provided in one sentence in everyday language.
localai-api-1 |
localai-api-1 | Current Time: 2024-01-09 16:05:39.239165+00:00
localai-api-1 |
localai-api-1 | Available Devices:
localai-api-1 | ```csv
localai-api-1 | entity_id,name,state,aliases
localai-api-1 | scene.office_standard,Office Standard,2024-01-09T08:59:36.686917+00:00,
localai-api-1 | scene.office_game,Office Game,2024-01-08T20:36:39.723638+00:00,
localai-api-1 | light.bedroom_lamp,Bedroom Lamp,on,
localai-api-1 | light.bedside_lamps,Bedside Lamps,on,
localai-api-1 | light.shapes_0c36,Shapes 0C36,on,
localai-api-1 | light.lines_01a4,Lines 01A4,on,
localai-api-1 | climate.hallway,Hallway,heat,
localai-api-1 | light.livingroom_corner_2,LivingRoom corner,unavailable,
localai-api-1 | light.kitchen_right_up,Kitchen Right UP,on,
localai-api-1 | light.desk_downlight,Desk Downlight,on,
localai-api-1 | light.controller_rgb_2304a5,Kitchen Left Up,on,
localai-api-1 | light.office_light_controller,Office Roof Lights,on,Office Main Light
localai-api-1 | light.tv_cabinet,TV Cabinet,off,
localai-api-1 | light.kitdownright,Kitchen Right Downlight,on,
localai-api-1 | light.kitchen_left_downlight,Kitchen Left Downlight,on,
localai-api-1 | light.backwall,BackWall,on,
localai-api-1 | light.4,4,on,
localai-api-1 | light.controller_rgb_fd2bd8,Bed Downlight,on,
localai-api-1 | switch.fps_smasher,Computer,off,
localai-api-1 | light.master_bedroom_table_lamp_bathroom,Master Bedroom Table Lamp Bathroom,on,
localai-api-1 | light.master_bedroom_table_lamp,Master Bedroom Table Lamp Window,on,
localai-api-1 | light.0x4c5bb3fffefcd9d6,En suit shower,off,
localai-api-1 | light.ensuit_down,Master Bathroom,off,
localai-api-1 | light.hallway,Hallway Downlights,off,
localai-api-1 | light.ensuit_downlights,EnSuit Downlights,off,
localai-api-1 | switch.tv_power,Air freshener,off,
localai-api-1 | light.livingroom_floorlamp,Livingroom Floor Lamp,off,
localai-api-1 | ```
localai-api-1 |
localai-api-1 | The current state of devices is provided in available devices.
localai-api-1 | Use execute_services function only for requested action, not for current states.
localai-api-1 | Do not execute service without user's confirmation.
localai-api-1 | Do not restate or appreciate what user says, rather make a quick inquiry.
localai-api-1 | Turn Computer On
localai-api-1 | 4:05PM DBG Prompt (after templating): I want you to act as smart home manager of Home Assistant.
localai-api-1 | I will provide information of smart home along with a question, you will truthfully make correction or answer using information provided in one sentence in everyday language.
localai-api-1 |
localai-api-1 | Current Time: 2024-01-09 16:05:39.239165+00:00
localai-api-1 |
localai-api-1 | Available Devices:
localai-api-1 | ```csv
localai-api-1 | entity_id,name,state,aliases
localai-api-1 | scene.office_standard,Office Standard,2024-01-09T08:59:36.686917+00:00,
localai-api-1 | scene.office_game,Office Game,2024-01-08T20:36:39.723638+00:00,
localai-api-1 | light.bedroom_lamp,Bedroom Lamp,on,
localai-api-1 | light.bedside_lamps,Bedside Lamps,on,
localai-api-1 | light.shapes_0c36,Shapes 0C36,on,
localai-api-1 | light.lines_01a4,Lines 01A4,on,
localai-api-1 | climate.hallway,Hallway,heat,
localai-api-1 | light.livingroom_corner_2,LivingRoom corner,unavailable,
localai-api-1 | light.kitchen_right_up,Kitchen Right UP,on,
localai-api-1 | light.desk_downlight,Desk Downlight,on,
localai-api-1 | light.controller_rgb_2304a5,Kitchen Left Up,on,
localai-api-1 | light.office_light_controller,Office Roof Lights,on,Office Main Light
localai-api-1 | light.tv_cabinet,TV Cabinet,off,
localai-api-1 | light.kitdownright,Kitchen Right Downlight,on,
localai-api-1 | light.kitchen_left_downlight,Kitchen Left Downlight,on,
localai-api-1 | light.backwall,BackWall,on,
localai-api-1 | light.4,4,on,
localai-api-1 | light.controller_rgb_fd2bd8,Bed Downlight,on,
localai-api-1 | switch.fps_smasher,Computer,off,
localai-api-1 | light.master_bedroom_table_lamp_bathroom,Master Bedroom Table Lamp Bathroom,on,
localai-api-1 | light.master_bedroom_table_lamp,Master Bedroom Table Lamp Window,on,
localai-api-1 | light.0x4c5bb3fffefcd9d6,En suit shower,off,
localai-api-1 | light.ensuit_down,Master Bathroom,off,
localai-api-1 | light.hallway,Hallway Downlights,off,
localai-api-1 | light.ensuit_downlights,EnSuit Downlights,off,
localai-api-1 | switch.tv_power,Air freshener,off,
localai-api-1 | light.livingroom_floorlamp,Livingroom Floor Lamp,off,
localai-api-1 | ```
localai-api-1 |
localai-api-1 | The current state of devices is provided in available devices.
localai-api-1 | Use execute_services function only for requested action, not for current states.
localai-api-1 | Do not execute service without user's confirmation.
localai-api-1 | Do not restate or appreciate what user says, rather make a quick inquiry.
localai-api-1 | Turn Computer On
localai-api-1 | 4:05PM DBG Grammar: space ::= " "?
localai-api-1 | string ::= "\"" (
localai-api-1 | [^"\\] |
localai-api-1 | "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
localai-api-1 | )* "\"" space
localai-api-1 | root-0-arguments-list-item-service-data ::= "{" space "\"entity_id\"" space ":" space string "}" space
localai-api-1 | root-0-arguments-list-item ::= "{" space "\"domain\"" space ":" space string "," space "\"service\"" space ":" space string "," space "\"service_data\"" space ":" space root-0-arguments-list-item-service-data "}" space
localai-api-1 | root-0-arguments-list ::= "[" space (root-0-arguments-list-item ("," space root-0-arguments-list-item)*)? "]" space
localai-api-1 | root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
localai-api-1 | root-1-function ::= "\"answer\""
localai-api-1 | root-0-arguments ::= "{" space "\"list\"" space ":" space root-0-arguments-list "}" space
localai-api-1 | root-0-function ::= "\"execute_services\""
localai-api-1 | root-1-arguments ::= "{" space "\"message\"" space ":" space string "}" space
localai-api-1 | root-1 ::= "{" space "\"arguments\"" space ":" space root-1-arguments "," space "\"function\"" space ":" space root-1-function "}" space
localai-api-1 | root ::= root-0 | root-1
localai-api-1 | 4:05PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q8_0.gguf
localai-api-1 | 4:05PM DBG Model 'luna-ai-llama2-uncensored.Q8_0.gguf' already loaded
localai-api-1 | 4:05PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr slot 0 is processing [task id: 2]
localai-api-1 | 4:05PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr slot 0 : kv cache rm - [0, end)
localai-api-1 | 4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr
localai-api-1 | 4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr print_timings: prompt eval time = 5866.82 ms / 701 tokens ( 8.37 ms per token, 119.49 tokens per second)
localai-api-1 | 4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr print_timings: eval time = 50921.90 ms / 25 runs ( 2036.88 ms per token, 0.49 tokens per second)
localai-api-1 | 4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr print_timings: total time = 56788.71 ms
localai-api-1 | 4:06PM DBG Function return: { "arguments": { "message": "Turn Computer On" } , "function": "answer"} map[arguments:map[message:Turn Computer On] function:answer]
localai-api-1 | 4:06PM DBG nothing to do, computing a reply
localai-api-1 | 4:06PM DBG Reply received from LLM: Turn Computer On
localai-api-1 | 4:06PM DBG Reply received from LLM(finetuned): Turn Computer On
localai-api-1 | 4:06PM DBG Response: {"created":1704814397,"object":"chat.completion","id":"56301b4b-9e94-4699-88a7-44e619222601","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q8_0.gguf","choices":[{"index":0,"message":{"role":"assistant","content":"Turn Computer On"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
localai-api-1 | [192.168.4.44]:35842 200 - POST /v1/chat/completions
localai-api-1 | 4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr slot 0 released (727 tokens in cache)
I'm pretty sure my LocalAI is working, because if I ask it how it is, it replies as I'd expect.