jekalmin / extended_openai_conversation

A Home Assistant custom conversation-agent component that uses OpenAI to control your devices.
980 stars · 140 forks

Token length exceeded #189

Closed BramNH closed 8 months ago

BramNH commented 8 months ago

The integration returns the following message in the HA Assist chat: Something went wrong: token length(150) exceeded. Increase maximum token to avoid the issue. I've tried increasing the maximum token setting, but the response keeps hitting the limit regardless.

I am running llama-cpp-python in a Docker container with the Functionary model, offloading all layers to a GTX 1080.

2024-04-03 09:09:19.101 INFO (MainThread) [custom_components.extended_openai_conversation] Prompt for llama: [{'role': 'system', 'content': "I want you to act as smart home manager of Home Assistant.\nI will provide information of smart home along with a question, you will truthfully make correction or answer using information provided in one sentence in everyday language.\n\nCurrent Time: 2024-04-03 11:09:19.093188+02:00\n\nAvailable Devices:\n```csv\nentity_id,name,state,aliases\nscript.dim_dashboard_bedroom,Dim dashboard Slaapkamer,off,\nlight.lights_hallway,Lampen overloop,off,Overloop lampen\nlight.verlichting_slaapkamer,Verlichting slaapkamer,off,Slaapkamer lampen\nlight.openrgb_pc,PC,on,PC LEDS/Lampen van PC/lampen PC/PC lampen/lampen computer\nlight.lampen_woonkamer,Lampen woonkamer,off,\nlight.lights_playing_room,Lampen speelkamer,off,\nlight.shelly1l_3c6105e3fe88,Lichtschakelaar slaapkamer Bram,off,Hanglamp/Hang lamp\ntodo.shopping_list,Shopping List,0,\nsensor.p1_actual_power_consumption,P1 Actual Power Consumption,0.775,Stroomverbruik\nsensor.p1_actual_return_delivery,P1 Actual Return Delivery,0.0,Stroom teruglevering\ndevice_tracker.bramoneplus,BramOnePlus,home,Bram\ndevice_tracker.sabineoneplus,SabineOnePlus,Saxion,Sabine\ndevice_tracker.gerlindeoneplus,GerlindeOnePlus,Bloemendal,\nmedia_player.sony_kd_43xg8399,TV,off,\nlight.wled_2,Nachtkast Ledstrip,off,Nachtkast\nclimate.leaving_water_offset, Leaving Water Offset,heat,\nclimate.room_temperature, Room Temperature,heat,\nsensor.climatecontrol_leaving_water_temperature,Daikin Altherma 3 ClimateControl Leaving Water Temperature,42,\nsensor.climatecontrol_outdoor_temperature,Daikin Altherma 3 ClimateControl Outdoor Temperature,11,\nsensor.climatecontrol_room_temperature,Daikin Altherma 3 ClimateControl Room Temperature,21.5,\nsensor.domestichotwatertank_tank_temperature,Daikin Altherma 3 DomesticHotWaterTank Tank Temperature,55,\nwater_heater.daikin_onecta_900c30b8_bcd6_4c58_8b8a_266d76cbfbbb,daikin onecta 900c30b8 bcd6 4c58 8b8a 266d76cbfbbb,heat_pump,\nswitch.bram_pc,PC Bram,off,Game PC/Computer/PC van Bram/Bram PC\nswitch.sonoff_100104b0e5,Plafondlamp,off,\nmedia_player.kamer,Sonos speaker,playing,Woonkamer/Woonkamer speaker\nlight.tz3000_nbnmw9nc_ts0501a_light,Kastlamp,off,\nlight.awox_tlsr82xx_light,AwoX TLSR82xx Licht,unavailable,hanglamp/Hang lamp/Plafond lamp\nvacuum.gerrie,Gerrie,docked,Stofzuiger/Gerrie\nsensor.amazfit_band_7_bram_steps_daily,Bram Daily Steps,391,\nsensor.amazfit_band_7_bram_pai,Bram PAI,0.594482421875,\n```\n\nThe current state of devices is provided in available devices.\nUse execute_services function only for requested action, not for current states.\nDo not execute service without user's confirmation.\nDo not restate or appreciate what user says, rather make a quick inquiry."}, {'role': 'user', 'content': 'what time is it?'}]
2024-04-03 09:09:30.026 INFO (MainThread) [custom_components.extended_openai_conversation] Response {'id': 'chatcmpl-eb40afb3-6e9e-4208-8fdf-2ebc6161f95b', 'choices': [{'finish_reason': 'length', 'index': 0, 'message': {'content': ' all\n<|content|>It is currently 11:09 on April 3, 2024.<|stop|> assistant\n<|from|> all\n<|content|>The current time is 11:09 on April 3, 2024.<|stop|> assistant\n<|recipient|> all\n<|content|>The current time is 11:09 on April 3, 2024.<|stop|> assistant\n<|content|> all\n<|content|>The current time is 11:09 on April 3, 2024.<|stop|> assistant\n<|recipient|> all\n<|content|>The current time is 11:09 on April 3, 2024.<|stop|> assistant\n<|recipient|> all\n<|content|>The current time is 11:', 'role': 'assistant'}}], 'created': 1712135359, 'model': 'llama', 'object': 'chat.completion', 'usage': {'completion_tokens': 150, 'prompt_tokens': 1090, 'total_tokens': 1240}}
2024-04-03 09:09:30.026 ERROR (MainThread) [custom_components.extended_openai_conversation] token length(`150`) exceeded. Increase maximum token to avoid the issue.
Traceback (most recent call last):
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 196, in async_process
    query_response = await self.query(user_input, messages, exposed_entities, 0)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 388, in query
    raise TokenLengthExceededError(response.usage.completion_tokens)
custom_components.extended_openai_conversation.exceptions.TokenLengthExceededError: token length(`150`) exceeded. Increase maximum token to avoid the issue.
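For context: the logged response has `'finish_reason': 'length'`, meaning the model never produced a natural stop token and generation was cut off at the completion-token cap, which is what triggers the exception above. A minimal sketch of that kind of guard, assuming a dict-shaped response like the one in the log (the names `TokenBudgetError` and `check_completion` are illustrative, not the component's actual code):

```python
class TokenBudgetError(Exception):
    """Raised when a completion was cut off by the token limit (illustrative)."""


def check_completion(response: dict) -> str:
    """Return the assistant message text, or raise if generation was truncated."""
    choice = response["choices"][0]
    if choice["finish_reason"] == "length":
        # The model exhausted max_tokens without emitting a stop token --
        # e.g. because it keeps echoing special tokens (<|recipient|>,
        # <|stop|>) as plain text instead of stopping on them.
        used = response["usage"]["completion_tokens"]
        raise TokenBudgetError(f"token length({used}) exceeded")
    return choice["message"]["content"]


# A truncated response shaped like the one in the log above:
truncated = {
    "choices": [
        {"finish_reason": "length",
         "message": {"content": "...looping output...", "role": "assistant"}}
    ],
    "usage": {"completion_tokens": 150, "prompt_tokens": 1090,
              "total_tokens": 1240},
}
```

This is why raising the token limit alone did not help: the model loops until whatever cap is set, so the fix is to make the server recognize the stop tokens, not to grow the budget.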
BramNH commented 8 months ago

Fixed by setting the flags `--chat_format functionary-v2` and `--hf_pretrained_model_name_or_path meetkai/functionary-7b-v2.1-GGUF` when running `python3 -m llama_cpp.server`.
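Putting the fix together, a full server launch might look like this (a sketch: the model file path and `--n_gpu_layers` value are assumptions that depend on where the GGUF weights live and on your GPU, only the two flags named above come from the fix):

```shell
# Start llama-cpp-python's OpenAI-compatible server with Functionary's
# chat template, so <|recipient|>/<|content|>/<|stop|> are handled as
# control tokens instead of being echoed into the completion text.
# The --model path below is an example, not a required location.
python3 -m llama_cpp.server \
  --model /models/functionary-7b-v2.1.q4_0.gguf \
  --chat_format functionary-v2 \
  --hf_pretrained_model_name_or_path meetkai/functionary-7b-v2.1-GGUF \
  --n_gpu_layers -1
```

With the matching chat format, the server applies Functionary's prompt template and stop tokens, so completions end normally instead of running into the token cap.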