I host a Llama 3 model on-prem with tgi and use the following in my `.env.local` to point chat-ui at it:
"endpoints":[
{
"type":"tgi",
"url":"http://llama3-70b-instruct-api"
}
]
Maybe setting the type to `tgi` instead of `openai` helps?
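For context, a `tgi` endpoint usually sits inside a full `MODELS` entry like the sketch below; the model name here is a placeholder, not the exact config from our setup:

```env
MODELS=`[
  {
    "name": "llama3-70b-instruct",
    "endpoints": [
      {
        "type": "tgi",
        "url": "http://llama3-70b-instruct-api"
      }
    ]
  }
]`
```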
Hello @hsayniaj79, thank you for your answer.
I thought I had to use the OpenAI API between chat-ui and tgi.
I can only find documentation for that API: https://github.com/oobabooga/text-generation-webui/wiki/12-%E2%80%90-OpenAI-API#examples
Was there a different API in an older version of tgi that you use, one that doesn't exist anymore?
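For reference, the `openai`-type endpoint I have been using points at the OpenAI-compatible API roughly like this (port 5000 and the `/v1` base path are text-generation-webui's defaults; treat the values as placeholders rather than my exact config):

```env
MODELS=`[
  {
    "name": "text-generation-webui",
    "endpoints": [
      {
        "type": "openai",
        "baseURL": "http://localhost:5000/v1"
      }
    ]
  }
]`
```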
Trying the `tgi` configuration just results in this 405 error:
```
Using a model URL is deprecated, please use the `endpointUrl` parameter instead
Using a model URL is deprecated, please use the `endpointUrl` parameter instead
Error: Server response contains error: 405
    at streamingRequest (file:///opt/chat-ui/node_modules/@huggingface/inference/dist/index.js:334:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Proxy.textGenerationStream (file:///opt/chat-ui/node_modules/@huggingface/inference/dist/index.js:705:3)
    at async Module.generate (/opt/chat-ui/src/lib/server/textGeneration/generate.ts:8:20)
    at async textGenerationWithoutTitle (/opt/chat-ui/src/lib/server/textGeneration/index.ts:56:3)
[11:10:10.326] ERROR (2416): Server response contains error: 405
    err: {
      "type": "Error",
      "message": "Server response contains error: 405",
      "stack":
          Error: Server response contains error: 405
              at streamingRequest (file:///opt/chat-ui/node_modules/@huggingface/inference/dist/index.js:334:11)
              at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
              at async Proxy.textGenerationStream (file:///opt/chat-ui/node_modules/@huggingface/inference/dist/index.js:705:3)
              at async Module.generateFromDefaultEndpoint (/opt/chat-ui/src/lib/server/generateFromDefaultEndpoint.ts:12:20)
              at async generateTitle (/opt/chat-ui/src/lib/server/textGeneration/title.ts:54:10)
              at async Module.generateTitleForConversation (/opt/chat-ui/src/lib/server/textGeneration/title.ts:17:19)
    }
```
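As an aside, the deprecation warning at the top of that log comes from `@huggingface/inference` whenever a raw URL is passed as the `model` parameter. A minimal sketch of addressing an endpoint directly with that library (illustrative only, this is not chat-ui's actual call site):

```ts
import { HfInferenceEndpoint } from "@huggingface/inference";

// Address the TGI endpoint directly instead of passing its URL as `model`.
const hf = new HfInferenceEndpoint("http://llama3-70b-instruct-api");

// Stream generated tokens back from the endpoint.
const stream = hf.textGenerationStream({
  inputs: "test",
  parameters: { max_new_tokens: 64 },
});

for await (const chunk of stream) {
  process.stdout.write(chunk.token.text);
}
```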
Hi @Monviech,
My bad, I confused Hugging Face's text-generation-inference (tgi) with oobabooga's text-generation-webui. For the latter, I think the `openai` type is correct. Could it be that the missing `chatPromptTemplate` in the models in `.env.local` is the issue?
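For illustration, a `chatPromptTemplate` along the lines of the Mistral-style example in the chat-ui README is shown below; the right template depends entirely on the model being served, so this is only a shape to start from:

```env
"chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}"
```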
@hsayniaj79 Oh, I hadn't noticed that either; I guess I should have written oobabooga to avoid the confusion. I have tried specifying the `chatPromptTemplate` with a few different templates, but it doesn't seem to change anything.
Also, I am not dead set on using oobabooga; it was just my first choice because I have used the Stable Diffusion web UI extensively and it looks just like it. I wanted a ChatGPT-style chat though, which is how I came to the Hugging Face web UI.
If the experience is better with the combination of https://github.com/huggingface/text-generation-inference and https://github.com/huggingface/chat-ui, I will stop the troubleshooting here and use tgi as the backend instead.
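For anyone reading along later, serving a model with tgi is roughly a one-liner with Docker; the model id, port, and volume below are illustrative, not a tested config for this exact setup:

```sh
# Serve a model with text-generation-inference; adjust model id and hardware flags.
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $PWD/tgi-data:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id meta-llama/Meta-Llama-3-70B-Instruct
```

chat-ui would then point at it with an endpoint like `{ "type": "tgi", "url": "http://localhost:8080" }`.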
@Monviech I can only share my personal experience. We've been using tgi+chat-ui on our Kubernetes cluster at a research institute for a while. So far, it's been pretty straightforward and painless.
@hsayniaj79 Thank you. I have deployed tgi+chatui instead and things instantly worked. I'm happy you helped me. :)
Hello,
I'm trying my best to get the Hugging Face chat-ui working with the API endpoint of text-generation-webui. I would be really happy to get a hint about what I am doing wrong.
Here is a reverse proxied test instance: https://chat-ui-test.pischem.com/
I can't get the prompt I enter in chat-ui to pass through to text-generation-webui. Every prompt is ignored and a random answer is returned.
Here is the command I start text-generation-webui with:

Here is my current `.env.local` of chat-ui and the command I run it with:

Here are the logs of what happens when I write a prompt:

chat-ui:

text-generation-webui:
I have entered `test` as the prompt in chat-ui, and the first answer is always `You are a helpful assistant`. Each time I enter another prompt, the answer becomes random, as if the AI asks a question and answers it itself. I have logged one of the random conversations:

The only thing that works is setting a system prompt, which is then used and an answer to that system prompt is generated. But any user prompt gets ignored and a random answer is given.
Here is a log example when the system prompt is set:
I want to know what I am missing: what makes the API endpoint accept my user prompt?
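A sanity check I could run is to send a user message straight to the OpenAI-compatible API, bypassing chat-ui entirely (port 5000 is text-generation-webui's default; my exact setup may differ):

```sh
# Send a single user message directly to text-generation-webui's OpenAI-compatible API.
curl http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "test"}], "max_tokens": 64}'
```

If this returns a sensible answer, the problem would be on the chat-ui side of the request.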
Environment:
- Ubuntu 22.04.4 LTS
- nodejs v22.3.0
- npm 10.8.1
- chat-ui@0.9.1 dev
- text-generation-webui@abe5ddc8833206381c43b002e95788d4cca0893a