LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0
4.97k stars 349 forks

Request cancellation via OpenAI API does not seem to work #745

Open KizzyCode opened 6 months ago

KizzyCode commented 6 months ago

If I use koboldcpp's OpenAI API via SillyTavern or LibreChat, and then cancel the request via the stop buttons, more often than not, koboldcpp happily keeps generating new tokens until either the token limit is reached or it comes to a conclusion.

I'm not 100% sure if that's a problem with koboldcpp; but since both frontends seem to work with other backends and fail with koboldcpp, I'd guess it is the outlier here.

Steps to reproduce:

  1. Setup SillyTavern or LibreChat
  2. Connect it to koboldcpp via the OpenAI v1 API
  3. Try to abort a request
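For context on what "abort" means here: with the OpenAI v1 streaming API, frontends like SillyTavern and LibreChat cancel a request by simply closing the HTTP connection mid-stream, and the backend is expected to notice the dropped connection and stop generating. Below is a minimal reproduction sketch, assuming koboldcpp's default port 5001 and the standard OpenAI v1 completions endpoint (the helper names are illustrative, not part of any API):

```python
import json

def build_streaming_payload(prompt, max_tokens=512):
    """Build an OpenAI-v1-style completion request with streaming enabled."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "stream": True,  # tokens arrive incrementally as server-sent events
    }

def parse_sse_line(line):
    """Extract the JSON payload from a 'data: {...}' SSE line.
    Returns None for keep-alives, blanks, and the final '[DONE]' sentinel."""
    if not line.startswith("data: "):
        return None
    body = line[len("data: "):].strip()
    if body == "[DONE]":
        return None
    return json.loads(body)

def reproduce(port=5001):
    """Open a streaming completion, read a few chunks, then close the
    connection. A correct backend should stop generating at that point;
    the reported bug is that koboldcpp often keeps going. Requires the
    'requests' package and a running koboldcpp instance."""
    import requests
    resp = requests.post(
        f"http://localhost:{port}/v1/completions",
        json=build_streaming_payload("Once upon a time"),
        stream=True,
    )
    for i, line in enumerate(resp.iter_lines(decode_unicode=True)):
        if parse_sse_line(line or ""):
            print("received a token chunk")
        if i >= 3:
            resp.close()  # cancel: simulates the frontend's stop button
            break
```

Whether the backend actually halts after `resp.close()` has to be checked on the server side (e.g. by watching koboldcpp's console output for continued token generation).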
LostRuins commented 6 months ago

Are you using streaming?

KizzyCode commented 6 months ago

Positive, I'm using streaming. Also, in like 1 out of 4 cases it seems to abort correctly, so I'm a bit puzzled... This is my koboldcpp config: miqu-1-70b.q5_K_M.kcpps.zip. I don't know of a good way to export SillyTavern's or LibreChat's config, but both are pretty vanilla.

LostRuins commented 6 months ago

What version of sillytavern and koboldcpp are you using? Did you select the "koboldCpp" option under text-completions endpoints?

KizzyCode commented 6 months ago

API type is Text Completion/KoboldCpp (screenshot attached: 2024-03-14 at 15:15:13)

LostRuins commented 6 months ago

Hmm, that is odd then. I'm not very sure, but I'll look into it. Did you try to see if mainline koboldcpp works compared to the rocm fork?

KizzyCode commented 6 months ago

Nope, I didn't; but I can try later, just to be sure it's not related to the rocm patches. Will give an update :)

KizzyCode commented 6 months ago

Happens too with the current vanilla release.
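For anyone debugging this: the expected server-side behavior is that the generation loop checks for an abort signal between tokens, where the signal is set when the client disconnects (or an explicit abort request arrives). This is an illustrative sketch of that pattern, not koboldcpp's actual code:

```python
import threading

def generate_stream(tokens, abort_event):
    """Yield tokens one at a time, stopping as soon as the abort signal
    is set (e.g. because the client hung up mid-stream). A backend that
    skips this check keeps generating until max tokens, which matches
    the behavior reported in this issue."""
    for tok in tokens:
        if abort_event.is_set():
            break  # a correct backend bails out here
        yield tok

# Simulate a client that cancels after receiving two tokens.
abort = threading.Event()
produced = []
for i, tok in enumerate(generate_stream(["a", "b", "c", "d", "e"], abort)):
    produced.append(tok)
    if i == 1:
        abort.set()  # the stop button / dropped connection fires here
# produced is ["a", "b"]; "c" through "e" are never generated.
```

As a workaround while this is open, it may also be worth checking whether koboldcpp's native abort endpoint (if I recall correctly, `/api/extra/abort` on its extra API) stops generation reliably, since that would narrow the bug down to disconnect detection on the OpenAI-compatible streaming path.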