acon96 / home-llm

A Home Assistant integration & Model to control your smart home using a Local LLM

HA assist / Ollama times out after previously working fine. Ollama works fine on the host machine. No configuration changes. #149

Closed kshoichi closed 3 weeks ago

kshoichi commented 1 month ago

Describe the bug
I'm using Ollama with llama3:8b. It worked great for several queries through Home Assistant assist. Then it stopped mid-chat with no changes to the configuration. Once in a while it will respond in assist after a very long time.

I confirmed that Ollama is functioning and responding to queries very rapidly on the host (a P520 with 64 GB RAM and a 3060 with 12 GB VRAM). I rebooted both machines, uninstalled this integration, and deleted the voice assistant in Home Assistant. Reinstalling went smoothly and the integration found the Ollama server, but the problem remains. Responses time out with "The generation request timed out! Please check your connection settings, increase the timeout in settings, or decrease the number of exposed entities." I have around 60 entities exposed.

I confirmed that HA Assist works fine using the Home Assistant conversation agent. I also reinstalled the NVIDIA drivers on the host and deleted and reinstalled the llama3:8b model there, and confirmed that the model is running in VRAM.
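
For anyone retracing this check, a quick way to rule out the Ollama server itself is to hit its HTTP API directly from another machine on the network. A minimal sketch, assuming Ollama's default port 11434 and a placeholder host address you would swap for your own:

```python
# Quick check that the Ollama server answers a generation request promptly.
# OLLAMA_HOST is a placeholder; replace it with your server's address.
import json
import time
import urllib.request

OLLAMA_HOST = "http://192.168.1.50:11434"

payload = json.dumps({
    "model": "llama3:8b",
    "prompt": "Say hello in one short sentence.",
    "stream": False,
}).encode("utf-8")

request = urllib.request.Request(
    f"{OLLAMA_HOST}/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)

start = time.monotonic()
with urllib.request.urlopen(request, timeout=90) as response:
    body = json.loads(response.read())
print(f"Responded in {time.monotonic() - start:.1f}s: {body['response'][:80]}")
```

If this responds in a few seconds from a remote machine but Assist still times out, the problem is more likely in the integration or Home Assistant itself than in Ollama.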

Expected behavior
Assist should be using ollama and responding rapidly with this hardware setup, not timing out.

Logs
If applicable, please upload any error or debug logs output by Home Assistant.

Logger: homeassistant
Source: custom_components/llama_conversation/__init__.py:41
integration: LLaMA Conversation ([documentation](https://github.com/acon96/home-llm))
First occurred: May 23, 2024 at 11:12:33 PM (2 occurrences)
Last logged: 9:32:23 AM

Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/config/custom_components/llama_conversation/__init__.py", line 41, in update_listener
    agent: LLaMAAgent = await ha_conversation._get_agent_manager(hass).async_get_agent(entry.entry_id)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'homeassistant.components.conversation' has no attribute '_get_agent_manager'. Did you mean: 'get_agent_manager'?

Core 2024.5.4
Supervisor 2024.05.2
Operating System 12.3
Frontend 20240501
acon96 commented 1 month ago

Looks like an incompatibility with the new HA version (2024.5.4)
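
For context, the traceback above points at the likely one-line culprit: the integration's update_listener reaches for a private helper, `_get_agent_manager`, which newer cores no longer expose, and Python's own suggestion ("Did you mean: 'get_agent_manager'?") names the public attribute that does exist. A minimal sketch of the kind of change involved, based only on the call shown in the traceback and not on the actual fix shipped in home-llm; it assumes the public helper returns the same manager object the private one did:

```python
# Sketch only. Assumes get_agent_manager() (suggested by the AttributeError
# itself) hands back the same agent manager the removed _get_agent_manager()
# did on older cores.
from homeassistant.components import conversation as ha_conversation
from homeassistant.config_entries import ConfigEntry
from homeassistant.core import HomeAssistant


async def update_listener(hass: HomeAssistant, entry: ConfigEntry) -> None:
    """Look the conversation agent back up when the config entry is updated."""
    # Old call, which raises the AttributeError above on core 2024.5.x:
    #     manager = ha_conversation._get_agent_manager(hass)
    manager = ha_conversation.get_agent_manager(hass)
    agent = manager.async_get_agent(entry.entry_id)
    # NOTE: whether async_get_agent() must be awaited has also changed between
    # core releases (the traceback shows the integration awaiting it), so the
    # exact call depends on the core version you target.
```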

benbender commented 1 month ago

Yeah, there seem to be some changes in that area recently:

kshoichi commented 1 month ago

It seems to be working fine now, but I made too many changes to identify the solution. I am on 2024.5.5 now. I also redid the machine that runs Ollama: it was running Ubuntu 24.04 and I replaced that with 22.04. There may have been some NVIDIA driver issues causing llama3 to drop into RAM instead of VRAM.

acon96 commented 1 month ago

Yeah, there seem to be some changes in that area recently:

* https://developers.home-assistant.io/docs/core/llm/

* https://developers.home-assistant.io/blog/2024/05/20/llm-api

This change rewrites basically the entire interface this integration uses to interact with Home Assistant. It might take me a while to do, so fair warning: it might be a bit before this is fixed.

acon96 commented 3 weeks ago

This should be solved now in v0.3