Closed dadcoachengineer closed 1 month ago
Are there any error messages produced by text-generation-webui? Can you also include any DEBUG logs from the Home Assistant logs?
Where do you set the port and IP address of the API?
I also have the same problem. text-generation-webui runs in a separate Docker container from my Home Assistant. I set up the LLaMA integration, and when I try to chat with the bot I can see in my t-g-w logs that the model is loaded, but I get a read timeout on the LLaMA side:
Sorry, there was a problem talking to the backend: HTTPConnectionPool(host='192.168.178.57', port=5000): Read timed out. (read timeout=90.0)
I have tried:
- with and without "AI as Character" (without it, I get an error on t-g-w)
- with and without "Use chat completions endpoint" (activating it results directly in an error: Failed to communicate with the API! 500 Server Error: Internal Server Error for url: http://192.168.178.57:5000/v1/chat/completions, even though that endpoint is tested and works)
- other prompt formats as well...
I also have the same model as @dadcoachengineer.
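To rule out networking problems independently of the integration, it can help to probe the OpenAI-compatible endpoint directly. A minimal sketch (the host and port are taken from the error message above; the payload fields are the usual chat-completions parameters, not anything specific to this integration):

```python
import json
import urllib.request

HOST, PORT = "192.168.178.57", 5000  # from the timeout error above


def chat_completions_url(host: str, port: int) -> str:
    """Build the OpenAI-compatible endpoint URL that t-g-w exposes."""
    return f"http://{host}:{port}/v1/chat/completions"


def probe(host: str, port: int, timeout: float = 90.0) -> dict:
    """Send a tiny chat request with an explicit read timeout,
    mirroring the 90 s timeout in the error message."""
    payload = json.dumps({
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 16,
    }).encode()
    req = urllib.request.Request(
        chat_completions_url(host, port),
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())

# probe(HOST, PORT)  # run from the HA host to check reachability
```

If this already hangs or returns a 500, the problem is on the t-g-w side rather than in the integration.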
> I have also the same problem. The text-generation-webui runs on a separate docker container from my home assistant. I setup the LLama integration and when I try to chat with the bot I see in my t-g-w logs that the model is loaded but I got a read timeout on LLama:
> Sorry, there was a problem talking to the backend: HTTPConnectionPool(host='192.168.178.57', port=5000): Read timed out. (read timeout=90.0)
Do you have the port right? My docker container for localAI defaults to 8080
> Do you have the port right? My docker container for localAI defaults to 8080
If I didn't, the model would not load at all... (it loads after setting up the LLaMA integration)
```
text-generation-webui-1 | 19:23:03-409986 INFO LOADER: "llama.cpp"
text-generation-webui-1 | 19:23:03-410923 INFO TRUNCATION LENGTH: 8960
text-generation-webui-1 | 19:23:03-411741 INFO INSTRUCTION TEMPLATE: "Custom (obtained from model metadata)"
text-generation-webui-1 | 19:23:03-412622 INFO Loaded the model in 0.82 seconds.
```
I tried llama.cpp directly from the integration. I also got no response from the 3B and 1B models (both tested). Curiously, the integration creates two identical services with the same name and shows the error: Invalid flow specified. I got the same results with the default TheBloke model that comes up when setting up the integration.
I'm posting this here because it seems that this bug is not directly related with text-generation-webui.
My system has an i5-9400T CPU with AVX and AVX2 support, and no NVIDIA/AMD GPU.
> Tried the LLama.cpp direct from the integration. I also got no response for the 3b and 1b models (both tested). Something curious is that the integration creates 2 identical services with the same name and shows the error: Invalid flow specified. The same results I got with the default TheBloke model that comes up by setting the integration.
I actually just identified the cause of this issue yesterday. I'm going to push a fix for that in the next day or so.
For the connection issues: I'm going to add some better debug logging to help identify what is happening. I'll reply back here when that is done.
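In the meantime, debug logs can already be enabled on the Home Assistant side via the standard `logger` section in `configuration.yaml`. A sketch, assuming the custom component's domain is `llama_conversation` (check the folder name under `custom_components/` in your install):

```yaml
logger:
  default: warning
  logs:
    custom_components.llama_conversation: debug
```

After a restart, the integration's requests and errors should show up in the Home Assistant log.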
Thank you @acon96
I will update when I see the release published. I think my issues may also have something to do with context window due to the number of entities I have exposed to assist.
Will drop logs in here ASAP.
> I think my issues may also have something to do with context window due to the number of entities I have exposed to assist.
I also thought about that and increased n_ctx. I'm no longer getting the context errors I had before increasing it, but now I may have other issues (RAM filling up...). It's a journey for all of us, and we are traveling extremely fast...
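Whether the exposed entities plausibly fit in n_ctx can be sanity-checked with a back-of-the-envelope estimate. A rough sketch: the ~4 characters-per-token ratio is a common heuristic rather than an exact tokenizer count, and the per-entity and base-prompt sizes below are guesses, not values from the integration:

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)


def prompt_fits(n_entities: int, n_ctx: int,
                base_prompt_chars: int = 2000,
                chars_per_entity: int = 120,
                reserved_for_reply: int = 512) -> bool:
    """Check whether a system prompt listing n_entities exposed entities
    plausibly fits in a context window of n_ctx tokens, leaving room
    for the model's reply."""
    prompt_chars = base_prompt_chars + n_entities * chars_per_entity
    return estimate_tokens("x" * prompt_chars) + reserved_for_reply <= n_ctx
```

With the truncation length shown in the log above (8960 tokens), a few hundred entities still fit under these assumptions; with a small n_ctx like 2048, a large entity list quickly does not, which matches the symptoms described here.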
@dadcoachengineer I have pushed v0.2.9 that has quite a few bug fixes as well as better debug logging/error handling. Please let me know if the error persists and what the debug logs look like.
Just tried the update. I no longer get the duplicate services installed with llama.cpp (Hugging Face).
Unfortunately I get the following error:
Sorry, there was a problem talking to the backend: 'LocalLLaMAAgent' object has no attribute 'grammar'
For text-generation-webui, the problem still seems to exist for me (internal error with the chat completions endpoint enabled, and a read timeout without it).
Some more context on the timeout error with text-generation-webui: after sending the prompt from HA, I tried to talk to the model through the t-g-w web interface. Suddenly the response is extremely slow (it took several minutes...), so I'm assuming the amount of information sent by HA is too big for it to handle, hence the timeout error (90 s).
@dadcoachengineer try exposing only a few entities (maybe 10) to the voice assistant; that did the trick for me... I would be happy to know how you installed text-generation-webui on the Jetson.
@Chreece - I am going to find some time this weekend to dig back in.
I am using Dusty's docker image. Works well.
```
dustynv/text-generation-webui:1.7-r35.4.1
NVIDIA_VISIBLE_DEVICES=all NVIDIA_DRIVER_CAPABILITIES=all
```
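For anyone wanting to reproduce this setup, the image and environment variables above translate to something like the following compose file. This is a sketch: the `runtime`, port mappings, and volume are assumptions about a typical Jetson install, not taken from this thread:

```yaml
services:
  text-generation-webui:
    image: dustynv/text-generation-webui:1.7-r35.4.1
    runtime: nvidia            # assumed: NVIDIA container runtime on the Jetson
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=all
    ports:
      - "7860:7860"            # web UI (assumed default port)
      - "5000:5000"            # OpenAI-compatible API (assumed default port)
    volumes:
      - ./models:/data/models  # hypothetical model directory
```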
closing b/c of inactivity
Running text-generation-webui on an external Jetson host. It works fine in the web chat, and I can even curl it from the Home Assistant CLI and get a reply.