acon96 / home-llm

A Home Assistant integration & Model to control your smart home using a Local LLM

The answer is in JSON format but not working #113

Closed rvsh2 closed 5 months ago

rvsh2 commented 5 months ago

The model is working and giving an answer, but I don't understand why the full JSON is given as the answer to the user (including all of the "to_say", "service", etc.).

When using TTS it says the whole thing instead of only "to_say". What am I missing?

I've got this answer:

ollama                         | time=2024-04-14T17:49:45.253Z level=WARN source=dyn_ext_server.go:199 msg="Prompt does not specify that the LLM should response in JSON, but JSON format is expected. For best results specify that JSON is expected in the system prompt."
ollama                         | {"function":"launch_slot_with_data","level":"INFO","line":826,"msg":"slot is processing task","slot_id":0,"task_id":867,"tid":"139821548824128","timestamp":1713116985}
ollama                         | {"function":"update_slots","ga_i":0,"level":"INFO","line":1803,"msg":"slot progression","n_past":498,"n_past_se":0,"n_prompt_tokens_processed":433,"slot_id":0,"task_id":867,"tid":"139821548824128","timestamp":1713116985}
ollama                         | {"function":"update_slots","level":"INFO","line":1830,"msg":"kv cache rm [p0, end)","p0":498,"slot_id":0,"task_id":867,"tid":"139821548824128","timestamp":1713116985}
ollama                         | {"function":"print_timings","level":"INFO","line":265,"msg":"prompt eval time     =     292.05 ms /   433 tokens (    0.67 ms per token,  1482.64 tokens per second)","n_prompt_tokens_processed":433,"n_tokens_second":1482.6431452579388,"slot_id":0,"t_prompt_processing":292.046,"t_token":0.6744711316397228,"task_id":867,"tid":"139821548824128","timestamp":1713116986}
ollama                         | {"function":"print_timings","level":"INFO","line":279,"msg":"generation eval time =    1404.51 ms /    56 runs   (   25.08 ms per token,    39.87 tokens per second)","n_decoded":56,"n_tokens_second":39.87164179317013,"slot_id":0,"t_token":25.080482142857143,"t_token_generation":1404.507,"task_id":867,"tid":"139821548824128","timestamp":1713116986}
ollama                         | {"function":"print_timings","level":"INFO","line":289,"msg":"          total time =    1696.55 ms","slot_id":0,"t_prompt_processing":292.046,"t_token_generation":1404.507,"t_total":1696.553,"task_id":867,"tid":"139821548824128","timestamp":1713116986}
ollama                         | {"function":"update_slots","level":"INFO","line":1634,"msg":"slot released","n_cache_tokens":987,"n_ctx":2048,"n_past":986,"n_system_tokens":0,"slot_id":0,"task_id":867,"tid":"139821548824128","timestamp":1713116986,"truncated":false}
ollama                         | [GIN] 2024/04/14 - 17:49:46 | 200 |  1.703946618s |      172.18.0.1 | POST     "/api/generate"
homeassistant                  | 2024-04-14 19:49:46.955 DEBUG (SyncWorker_32) [custom_components.llama_conversation.agent] {'model': 'bielik-7b:latest', 'created_at': '2024-04-14T17:49:46.951719633Z', 'response': '{"to_say": "The weather today is sunny with a temperature of 12C and a humidity of 44%.", "service": "weather.get_forecast", "target_device": "media_player.biuro"}', 'done': True, 'context': [733, 16289, 28793, 28705, 1, 28792, 16289, 28793, 1992, 19571, 297, 20329, 3842, 298, 2188, 28723, 13, 1976, 460, 464, 2707, 647, 264, 10865, 16107, 21631, 369, 13186, 272, 8309, 297, 264, 2134, 28723, 21929, 272, 2296, 3638, 1460, 12317, 286, 395, 272, 1871, 3857, 865, 28723, 13, 12926, 28747, 4077, 28730, 7449, 28723, 499, 28730, 266, 1648, 4077, 28730, 7449, 28723, 499, 28730, 1769, 1648, 4077, 28730, 7449, 28723, 13615, 1648, 4077, 28730, 7449, 28723, 15392, 28730, 715, 1648, 4077, 28730, 7449, 28723, 15392, 28730, 3254, 1648, 4077, 28730, 7449, 28723, 9660, 28730, 1674, 28730, 26431, 1648, 4077, 28730, 7449, 28723, 9660, 28730, 1674, 1648, 4077, 28730, 7449, 28723, 9660, 28730, 26431, 1648, 4077, 28730, 7449, 28723, 9660, 28730, 6898, 1648, 4077, 28730, 7449, 28723, 9660, 28730, 3423, 28730, 7822, 1648, 4077, 28730, 7449, 28723, 9660, 28730, 19740, 28730, 7822, 1648, 4077, 28730, 7449, 28723, 6206, 28730, 1674, 1703, 1648, 4077, 28730, 7449, 28723, 15392, 28730, 673, 1648, 4077, 28730, 7449, 28723, 15392, 28730, 28719, 1723, 1648, 4077, 28730, 7449, 28723, 9660, 28730, 18694, 1648, 4077, 28730, 7449, 28723, 5906, 1648, 4077, 28730, 7449, 28723, 5033, 28730, 1394, 1648, 4077, 28730, 7449, 28723, 5033, 28730, 13752, 28730, 4046, 1648, 4077, 28730, 7449, 28723, 1674, 28730, 9660, 1648, 4077, 28730, 7449, 28723, 811, 17752, 28730, 673, 1648, 4077, 28730, 7449, 28723, 370, 5906, 1648, 4077, 28730, 7449, 28723, 25997, 28730, 673, 1648, 4933, 28723, 499, 28730, 1769, 1648, 4933, 28723, 499, 28730, 266, 1648, 4933, 28723, 13615, 1648, 8086, 28723, 527, 28730, 994, 2867, 1648, 8086, 28723, 527, 28730, 994, 2867, 28713, 470, 13, 4991, 1214, 28747, 13, 9660, 28730, 7449, 28723, 3253, 28730, 28783, 28740, 28782, 28734, 464, 28780, 28764, 7502, 3023, 19985, 28742, 327, 25282, 28745, 3871, 28746, 28770, 28734, 13, 15017, 28723, 615, 1265, 28730, 28727, 28722, 28730, 28770, 28784, 28750, 28734, 28730, 17384, 28730, 10323, 28730, 655, 464, 11253, 4780, 394, 28765, 28733, 28770, 28784, 28750, 28734, 10264, 4777, 20195, 28742, 327, 28705, 28784, 28783, 13, 9660, 28730, 7449, 28723, 6309, 2138, 464, 27405, 2138, 28742, 327, 805, 13, 9660, 28730, 7449, 28723, 6309, 2138, 464, 6309, 2138, 28742, 327, 805, 13, 9660, 28730, 7449, 28723, 6309, 2138, 464, 28721, 9309, 28884, 8491, 12121, 2285, 28708, 28742, 327, 805, 13, 769, 1223, 28723, 994, 2867, 28730, 9029, 464, 28753, 476, 10769, 28742, 327, 4376, 1780, 28745, 28740, 28750, 28743, 28745, 28781, 28781, 28823, 13, 7252, 28723, 266, 8948, 28730, 14199, 28730, 28781, 28787, 28787, 28726, 28734, 28734, 28730, 1730, 28730, 20001, 28730, 2186, 22930, 2486, 8948, 25815, 10586, 479, 570, 28752, 28705, 28781, 28787, 28787, 28726, 28734, 28734, 5938, 394, 621, 10918, 28742, 327, 356, 13, 7252, 28723, 16318, 28255, 28750, 26553, 781, 28730, 15401, 28730, 487, 2162, 28730, 5906, 464, 28828, 326, 28255, 28750, 28755, 28824, 16235, 15050, 17627, 279, 5175, 28742, 327, 805, 13, 13, 13, 1146, 19571, 298, 272, 2296, 2188, 13126, 486, 26167, 297, 272, 1348, 5032, 390, 272, 2296, 9254, 28747, 13, 6799, 532, 28730, 21205, 1264, 345, 28738, 476, 18700, 272, 4077, 4385, 354, 
368, 9191, 345, 5134, 1264, 345, 9660, 28730, 7449, 28723, 13615, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 23429, 288, 805, 272, 4933, 390, 11939, 9191, 345, 5134, 1264, 345, 7252, 28723, 499, 28730, 1769, 548, 345, 3731, 28730, 3915, 1264, 345, 7252, 28723, 16318, 28255, 28750, 26553, 781, 28730, 15401, 28730, 487, 2162, 28730, 5906, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 7580, 288, 852, 298, 272, 3454, 3508, 9191, 345, 5134, 1264, 345, 9660, 28730, 7449, 28723, 9660, 28730, 19740, 28730, 7822, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 657, 961, 3706, 272, 7531, 354, 368, 9191, 345, 5134, 1264, 345, 9660, 28730, 7449, 28723, 15392, 28730, 715, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 11874, 288, 272, 7531, 354, 368, 9191, 345, 5134, 1264, 345, 9660, 28730, 7449, 28723, 15392, 28730, 28719, 1723, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 28753, 1899, 288, 272, 4077, 1156, 1435, 9191, 345, 5134, 1264, 345, 9660, 28730, 7449, 28723, 9660, 28730, 26431, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 10543, 2917, 272, 4077, 1156, 1435, 9191, 345, 5134, 1264, 345, 9660, 28730, 7449, 28723, 9660, 28730, 6898, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 9150, 10206, 298, 272, 1679, 3508, 9191, 345, 5134, 1264, 345, 9660, 28730, 7449, 28723, 9660, 28730, 3423, 28730, 7822, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 23429, 288, 356, 272, 4933, 354, 368, 9191, 345, 5134, 1264, 345, 7252, 28723, 499, 28730, 266, 548, 345, 3731, 28730, 3915, 1264, 345, 7252, 28723, 16318, 28255, 28750, 26553, 781, 28730, 15401, 28730, 487, 2162, 28730, 5906, 17395, 13, 13, 6799, 532, 28730, 21205, 1264, 345, 23429, 288, 805, 272, 4077, 4385, 390, 11939, 9191, 345, 5134, 1264, 345, 9660, 28730, 7449, 28723, 499, 28730, 1769, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395, 5660, 349, 272, 8086, 3154, 28804, 733, 28748, 16289, 28793, 28705, 13, 733, 28748, 16289, 28793, 9830, 532, 28730, 21205, 1264, 345, 1014, 8086, 3154, 349, 4376, 1780, 395, 264, 7641, 302, 28705, 28740, 28750, 28743, 304, 264, 1997, 21545, 302, 28705, 28781, 28781, 28823, 9191, 345, 5134, 1264, 345, 769, 1223, 28723, 527, 28730, 994, 2867, 548, 345, 3731, 28730, 3915, 1264, 345, 9660, 28730, 7449, 28723, 6309, 2138, 17395], 'total_duration': 1700428610, 'load_duration': 1811805, 'prompt_eval_count': 433, 'prompt_eval_duration': 292046000, 'eval_count': 56, 'eval_duration': 1404507000}
homeassistant                  | 2024-04-14 19:49:46.955 WARNING (SyncWorker_32) [custom_components.llama_conversation.agent] Model response did not end on a stop token (unfinished sentence)
homeassistant                  | 2024-04-14 19:49:46.956 DEBUG (MainThread) [custom_components.llama_conversation.agent] {"to_say": "The weather today is sunny with a temperature of 12C and a humidity of 44%.", "service": "weather.get_forecast", "target_device": "media_player.biuro"}
acon96 commented 5 months ago

It looks like the integration isn't detecting the JSON block in the response properly. The value for the Service Call Regex should be something that matches the single-line JSON output, like ({[\S \t]*?}), which should have been the default for a model that was configured with In Context Learning Examples. Is your setting different from that?
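
For reference, here is roughly how that regex extracts the service call from the raw model output. This is a simplified Python sketch, not the integration's actual code; the sample response is copied from the debug log above.

import json
import re

# Default service call regex for ICL-configured models: it captures a
# single-line JSON object (newlines are not allowed inside the braces).
SERVICE_CALL_REGEX = re.compile(r"({[\S \t]*?})")

raw = ('{"to_say": "The weather today is sunny with a temperature of 12C '
       'and a humidity of 44%.", "service": "weather.get_forecast", '
       '"target_device": "media_player.biuro"}')

match = SERVICE_CALL_REGEX.search(raw)
if match:
    call = json.loads(match.group(1))
    print(call["to_say"])                          # only this should reach TTS
    print(call["service"], call["target_device"])  # drives the service call
else:
    # With no match, the whole raw response falls through to TTS,
    # which is exactly the behaviour described in this issue.
    print(raw)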

rvsh2 commented 5 months ago

Prompt format: Mistral
Enable ICL: checked (16)
Max tokens in response: 4096
Service call REGEX: ({[\S \t]*?})

Prompt:

Respond in Polish language to user.
You are 'Al', a helpful AI Assistant that controls the devices in a house. Complete the following task as instructed with the information provided only.
Services: {{ services }}
Devices:
{{ devices }}

Respond to the following user instruction by responding in the same format as the following examples:
{{ response_examples }}

User instruction:
rvsh2 commented 5 months ago

When I added "Be very precise with answer formatting." to the prompt, I got:

homeassistant                  | 2024-04-14 20:56:41.138 WARNING (SyncWorker_33) [custom_components.llama_conversation.agent] Model response did not end on a stop token (unfinished sentence)
homeassistant                  | 2024-04-14 20:56:41.138 DEBUG (MainThread) [custom_components.llama_conversation.agent] {"to_say": "The current weather is clear-night with a temperature of 10C and a humidity of 52%.", "service": "weather.get_forecast", "target_device": "media_player.biuro"}
homeassistant                  | 2024-04-14 20:56:41.138 INFO (MainThread) [custom_components.llama_conversation.agent] running services: {"to_say": "The current weather is clear-night with a temperature of 10C and a humidity of 52%.", "service": "weather.get_forecast", "target_device": "media_player.biuro"}
homeassistant                  | 2024-04-14 20:56:41.138 DEBUG (MainThread) [custom_components.llama_conversation.agent] service data: {'entity_id': 'media_player.biuro'}
homeassistant                  | 2024-04-14 20:56:41.138 ERROR (MainThread) [custom_components.llama_conversation.agent] Failed to run: {"to_say": "The current weather is clear-night with a temperature of 10C and a humidity of 52%.", "service": "weather.get_forecast", "target_device": "media_player.biuro"}
homeassistant                  | Traceback (most recent call last):
homeassistant                  |   File "/config/custom_components/llama_conversation/agent.py", line 337, in async_process
homeassistant                  |     await self.hass.services.async_call(
homeassistant                  |   File "/usr/src/homeassistant/homeassistant/core.py", line 2275, in async_call
homeassistant                  |     raise ValueError(
homeassistant                  | ValueError: Service call requires responses but caller did not ask for response
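
For context, that ValueError comes from Home Assistant core: weather.get_forecast is a service that returns a response, and core refuses to run it unless the caller explicitly asks for that response. A minimal sketch of the call shape core expects (a fragment that assumes a running hass instance; the weather entity name is a placeholder):

# Services that return data must be called with return_response=True,
# otherwise core.py raises the ValueError shown in the traceback above.
response = await hass.services.async_call(
    "weather",
    "get_forecast",
    {"entity_id": "weather.home", "type": "daily"},  # placeholder entity
    blocking=True,
    return_response=True,  # opt in to the service's response data
)

Note also that in the log the model targeted media_player.biuro, a media player rather than a weather entity, which is a second mismatch on top of the missing response flag.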
rvsh2 commented 5 months ago

What should the output look like? Should it start with ```homeassistant\n?

acon96 commented 5 months ago

What should the output look like? Should it start with ```homeassistant\n?

No, that is only for a fine-tuned model. If you are using a non-fine-tuned model with ICL, then it should only output the JSON object.
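
For illustration, the two output shapes side by side (the values are placeholders):

An ICL (non-fine-tuned) model should emit the bare JSON object on a single line:

{"to_say": "Turning on the light.", "service": "light.turn_on", "target_device": "light.kitchen"}

A fine-tuned Home LLM model instead wraps the same object in a code block that begins with ```homeassistant:

```homeassistant
{"to_say": "Turning on the light.", "service": "light.turn_on", "target_device": "light.kitchen"}
```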

rvsh2 commented 5 months ago

I also tried other models, for example "Hermes 2 Pro Mistral 7B", which should be capable of function calling. Still no positive results. The models' answers look OK as JSON, but there are the problems described in the previous posts. Maybe the prompt needs to be fine-tuned? So far I have tested about 5 models and none works correctly. I also tried Extended OpenAI Conversation with the same results. Right now the only way to get reliable results is to use gpt-3.5-turbo-1106 with Extended OpenAI Conversation. But I would love to use a model locally. Your Home LLM software looks very promising and could be the first that works locally; I have a feeling that very little is missing to make it work with any local model. Do you have any advice?

rvsh2 commented 5 months ago

Another question: if the JSON object is correct and contains "to_say", "service", and "target_device", why doesn't Home Assistant recognize it, so that the whole thing goes to TTS and the voice reads out the entire JSON? Maybe there should be a safeguard for this? The second problem is that the model sometimes chooses the wrong "target_device", but I think that could be fixed in the prompt.

acon96 commented 5 months ago

Another question: if the JSON object is correct and contains "to_say", "service", and "target_device", why doesn't Home Assistant recognize it, so that the whole thing goes to TTS and the voice reads out the entire JSON? Maybe there should be a safeguard for this?

If the model is producing output that is not matched by the service call regex, then you should alter the regex to capture the output or use a backend that supports GBNF grammar or JSON mode to force the model to produce output that is recognized.
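
As a concrete example of the JSON-mode route with the Ollama backend used in this issue: Ollama's /api/generate endpoint accepts "format": "json", which constrains generation to syntactically valid JSON. The sketch below is illustrative; the model name matches the log above, and the prompt is a placeholder.

import json
import requests

# Sketch: ask Ollama for JSON-constrained output. Per the warning in the
# log above, the prompt itself should also state that JSON is expected.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "bielik-7b:latest",
        "prompt": ('Turn off the office media player. Respond only with a '
                   'JSON object containing "to_say", "service" and '
                   '"target_device".'),
        "format": "json",  # force valid JSON output
        "stream": False,
    },
    timeout=120,
)
call = json.loads(resp.json()["response"])
print(call.get("service"), call.get("target_device"))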

Closing as this is resolved