LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Whisper errors on KoboldAI Lite frontend (SillyTavern with same KoboldCpp Whisper works well) #1189

Closed: Denplay195 closed this issue 3 weeks ago

Denplay195 commented 1 month ago

When using a Whisper model (ggml-tiny-en.bin) in the KoboldAI Lite frontend, a window shows 'Error while submitting prompt: Error: Error occurred while SSE streaming: Service Unavailable', and the text produced by the recognition is never processed. SillyTavern works perfectly well with the same Whisper model loaded through KoboldCpp.

If I try again with token streaming changed to 'Poll' or even 'Off', it says 'Error occurred during text generation: {"detail":{"msg":"Server is busy; please try again later.","type":"service_unavailable"}}'.


The same happens with every text and Whisper model I've tried, regardless of size (from 3B to 13B). Switching the voice input mode doesn't help either.

NoAVX2 Vulkan backend

Denplay195 commented 1 month ago

It seems the voice input in the KoboldAI Lite frontend still shows as busy even though recognition has already finished. Generation starts once I close the error window and click the "Busy" submit/mic button.

LostRuins commented 1 month ago

Thanks for reporting this. This happens because the request comes too soon. An easy fix is to enable multiuser mode (checkbox or using --multiuser). It will be fixed in the next version.
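For anyone hitting the same "Server is busy" response from their own script rather than from Lite, the usual client-side approach is to wait briefly and resubmit. Below is a minimal sketch of that idea, not KoboldCpp's or Lite's actual fix. It assumes the busy response arrives as HTTP 503 with the JSON body quoted earlier in this thread. `is_busy`, `submit_with_retry`, and the `send` callable are hypothetical names chosen for illustration; `send` stands in for whatever function performs your request and returns `(status, body)`.

```python
import json
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, raw response body)


def is_busy(status: int, body: str) -> bool:
    """True if the response matches the 'service_unavailable' error from this thread.

    Assumption: the server answers with HTTP 503 and a JSON body shaped like
    {"detail": {"msg": "...", "type": "service_unavailable"}}.
    """
    if status != 503:
        return False
    try:
        return json.loads(body)["detail"]["type"] == "service_unavailable"
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not the expected shape: do not treat it as "busy".
        return False


def submit_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it stops reporting 'busy' or attempts run out.

    The wait doubles after every busy response (0.5 s, 1 s, 2 s, ...).
    The last response is returned either way, so the caller can inspect it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if not is_busy(status, body):
            return status, body
        if attempt < max_attempts - 1:
            sleep(delay * (2 ** attempt))
    return status, body
```

The `sleep` parameter exists only so that the waiting can be replaced in tests. In normal use, pass just your own `send` function.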

LostRuins commented 3 weeks ago

Hi, this should be fixed in the latest build.

Denplay195 commented 3 weeks ago

Indeed it is! (I've tested without multiuser.) Closing the issue 👍