Open cody151 opened 1 month ago
Hi :wave: I'd need to know the following in order to help you out:

What settings are you using in text-generation-web-ui?
What model are you running?
What is the content of your MODELS var for chat-ui? (Making sure to hide any secrets)

Seems to me like a prompt template issue or a stop token issue, but I can't be sure until I see the above.
Hi @nsarrazin many thanks for getting back to me,
What settings are you using in text-generation-web-ui ?
What model are you running ?
What is the content of your MODELS var for chat-ui ? (Making sure to hide any secrets)
Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf$:
  loader: llama.cpp
  cpu: false
  cache_8bit: false
  cache_4bit: false
  threads: 0
  threads_batch: 0
  n_batch: 512
  no_mmap: false
  mlock: false
  no_mul_mat_q: false
  n_gpu_layers: 33
  tensor_split: ''
  n_ctx: 33792
  compress_pos_emb: 1
  rope_freq_base: 500000
  numa: false
  no_offload_kqv: false
  row_split: false
  tensorcores: false
  flash_attn: false
  streaming_llm: true
  attention_sink_size: 5
Meta-Llama-3.1-8B-Instruct-abliterated.i1-Q5_K_M.gguf$:
  loader: llama.cpp
  cpu: false
  cache_8bit: false
  cache_4bit: false
  threads: 0
  threads_batch: 0
  n_batch: 512
  no_mmap: false
  mlock: false
  no_mul_mat_q: false
  n_gpu_layers: 33
  tensor_split: ''
  n_ctx: 8192
  compress_pos_emb: 1
  rope_freq_base: 500000
  numa: false
  no_offload_kqv: false
  row_split: false
  tensorcores: false
  flash_attn: false
  streaming_llm: true
  attention_sink_size: 5
Can you also show the MODELS used in chat-ui, not the config-user.yaml for text-generation-webui? :eyes:
Can you also show the MODELS used in chat-ui, not the config-user.yaml for text-generation-webui? :eyes:
Sure, please see the attached screenshots:
Chat-UI
text-generation-webui
@nsarrazin these are the contents of my Chat-UI ".env.local" file:
MONGODB_URL=mongodb://[mongoIP]:12345
#HF_TOKEN=<your access token>
PUBLIC_APP_NAME=Chat-UI
PUBLIC_APP_ASSETS=chatui
PUBLIC_APP_COLOR=blue
PUBLIC_APP_DESCRIPTION="AI"
PUBLIC_APP_DATA_SHARING=
PUBLIC_APP_DISCLAIMER=
MODELS=`[
  {
    "name": "text-generation-webui",
    "id": "text-generation-webui",
    "parameters": {
      "temperature": 1,
      "top_p": 0.95,
      "max_new_tokens": 1024,
      "stop": []
    },
    "endpoints": [{
      "type": "openai",
      "baseURL": "http://[text-gen-ip]:5000/v1",
      "extraBody": {
        "repetition_penalty": 1.2,
        "top_k": 50
      }
    }]
  }
]`
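Since `stop` is empty in the config above and a stop-token issue was one of the hypotheses, it may be worth trying Llama 3.1's end-of-turn markers there. This is a hedged sketch, not a confirmed fix: `<|eot_id|>` and `<|end_of_text|>` are Llama 3's documented special tokens, but whether text-generation-webui already handles them via the model's chat template is an assumption worth verifying.

```json
"parameters": {
  "temperature": 1,
  "top_p": 0.95,
  "max_new_tokens": 1024,
  "stop": ["<|eot_id|>", "<|end_of_text|>"]
}
```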
Does anyone have any ideas? Chat-UI is completely unusable, so I'm confused how people have (supposedly) got this working.
Hard to tell what the issue is; probably a config error somewhere. I know people have successfully used the llama.cpp OpenAI-compatible server as a backend, so it could be worth trying that to isolate where the issue is coming from.
Thanks, but llama.cpp is abysmally slow compared to Oobabooga text-gen.
@nsarrazin could you change the tag to "bugs", as this is clearly broken in the current implementation? Not sure why the "Support" tag was added.
Oobabooga text-generation-webui engine used for inference (prompts input directly into the Oobabooga UI produce normal results, but chat-ui is doing something weird, as shown below); MongoDB set up.
Prompt: bake a cake
Assistant:
Title for the chat: Tax refund help
JSON GET RESPONSE DATA:
Prompt 2: make a cake
Assistant:
Title for the chat: COVID vaccine approval
JSON GET RESPONSE DATA:
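Given the empty assistant replies and mismatched titles above, one low-cost check is whether chat-ui is actually receiving the config you think it is: the MODELS value must parse as strict JSON, and strict JSON rejects trailing commas, among other things. A minimal sketch, assuming you paste your own MODELS value (the text between the backticks) into the string:

```python
import json

# Paste the value of MODELS (between the backticks in .env.local) here.
models = '[{"name": "text-generation-webui", "parameters": {"stop": []}}]'

json.loads(models)  # raises json.JSONDecodeError if the value is not strict JSON
print("MODELS parses as strict JSON")

# Example of what strict JSON rejects: a trailing comma inside an object.
try:
    json.loads('{"top_k": 50,}')
except json.JSONDecodeError as exc:
    print("trailing comma rejected:", exc)
```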