Closed mayaeary closed 1 month ago
`logit_bias` instead, which already exists and does not need to be added. `banned_strings` as you suggest for ST.
Does it still crash after I've merged your latest PR?
See https://github.com/LostRuins/koboldcpp/commit/d75cbd671d0c241fccde1eed26e598eab6519dd9
Also, is `banned_token_ids` a SillyTavern feature? I can alias that into `logit_bias` = -1000 if it is used. It will have the same effect.
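A minimal sketch of what that aliasing could look like on the Python side. The helper name `alias_bans_to_logit_bias` and the string-keyed `logit_bias` dict are assumptions for illustration, not koboldcpp's actual code; only the -1000 value comes from the comment above.

```python
from typing import Dict, Iterable, Optional

# Bias value quoted in the thread; large enough to act as a ban.
BAN_BIAS = -1000.0


def alias_bans_to_logit_bias(
    banned_token_ids: Iterable[int],
    logit_bias: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Return a copy of logit_bias with every banned id set to BAN_BIAS.

    Hypothetical helper: keys are token ids as strings (assumed here because
    JSON object keys are strings). A ban overrides any existing bias.
    """
    merged = dict(logit_bias or {})
    for tok in banned_token_ids:
        merged[str(int(tok))] = BAN_BIAS
    return merged
```

With this shape, a request carrying both fields ends up with a single `logit_bias` map, so no separate ban path is needed.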
> Does it still crash after I've merged your latest PR?
With that fix it works well now.
> going to add quite a lot of overhead
Where is that overhead coming from? As far as I understand the source, the tokens are transferred to the C++ part once per request, so 32 or 64 does not make much difference. And when the bans are actually applied, it iterates up to `banned_token_ids.size()` anyway. This limit is applied only when you transfer genparams from Python to C++. It's much more frustrating when you add a token ban or logit bias and desperately try to figure out why it's not working.
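To illustrate the "why is it not working" concern: a rough sketch of clamping the list on the Python side while at least warning when entries are dropped. The function name `clamp_banned_tokens` and the `limit` parameter are hypothetical; the real limit and transfer code live in koboldcpp and are not reproduced here.

```python
import warnings
from typing import List, Sequence


def clamp_banned_tokens(ids: Sequence[int], limit: int) -> List[int]:
    """Keep at most `limit` ids and warn if any had to be dropped.

    Hypothetical helper: shows one way to make truncation visible
    instead of silently ignoring the extra bans.
    """
    if len(ids) > limit:
        dropped = len(ids) - limit
        warnings.warn(
            f"{dropped} banned token id(s) dropped: limit is {limit}",
            stacklevel=2,
        )
    return list(ids[:limit])
```

Whatever the limit is, a message like this would tell the user that some bans were ignored.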
> Also, is `banned_token_ids` a sillytavern feature?
No, SillyTavern uses `custom_token_bans` as a string of token ids: `1,2,3,444,42,69`.
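For reference, a small sketch of turning that comma-separated string into a list of ids. `parse_custom_token_bans` is a hypothetical name, and silently skipping malformed entries is an assumption about the desired behaviour, not what SillyTavern or koboldcpp actually do.

```python
from typing import List


def parse_custom_token_bans(raw: str) -> List[int]:
    """Parse a string such as "1,2,3,444,42,69" into a list of token ids.

    Hypothetical helper: empty or non-numeric entries are skipped
    rather than raising, so one bad entry does not discard the rest.
    """
    ids: List[int] = []
    for part in raw.split(","):
        part = part.strip()
        # Accept plain ASCII digits only; this also rejects negative numbers.
        if part.isascii() and part.isdigit():
            ids.append(int(part))
    return ids
```

The result could then be passed on as `banned_token_ids`.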
looks good enough to merge for now, might add some sanity checks for bad formatting later.
What's changed:
Added `banned_token_ids` to genparams to be able to ban tokens by id.
Unfortunately, it seems that `banned_phrases` is broken now: it causes an access violation (model Rocinante-12B-v1.Q8_0), so I couldn't check whether that works.