LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Fix antislop compatibility with ST custom stopping strings #1176

Closed Nexesenex closed 1 month ago

Nexesenex commented 1 month ago

And an example of the proper format for the custom stopping strings input box in ST:

["[EDIT]", "[EDIT]", "OOM:", "Note:", "Note:", "User note:",}

Basically, the JSON serialized array that ST recommends.
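For illustration, malformed input like the quoted example (a stray brace or trailing comma) can be caught before pasting it into ST. A minimal sketch, assuming the box accepts exactly what `json.loads` accepts; the function name is hypothetical:

```python
import json

def parse_stopping_strings(raw: str) -> list[str]:
    """Parse the JSON-serialized array of strings that the ST custom
    stopping strings box expects; raise on malformed input."""
    value = json.loads(raw)
    if not (isinstance(value, list)
            and all(isinstance(s, str) for s in value)):
        raise ValueError("expected a JSON array of strings")
    return value

# A well-formed example (note the closing ']' and no trailing comma):
print(parse_stopping_strings('["[EDIT]", "OOM:", "Note:", "User note:"]'))
```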

LostRuins commented 1 month ago

No, I think you have it confused. Stop sequences are a separate thing: they halt output when encountered. Banned sequences (aka anti-slop) instead rewind and regenerate a different token when encountered, rather than terminating.
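The two behaviours can be contrasted with a toy sketch. This is illustrative only, not KoboldCPP's actual implementation: the vocabulary, uniform sampling, and rewind bookkeeping are all made up for the example.

```python
import random

VOCAB = ["a", "cold", "shiver", "ran", "down", "her", "spine", "."]

def generate(stop_seqs, banned_seqs, max_tokens=15, seed=0):
    """Toy sampler. A stop sequence halts generation when it appears;
    a banned sequence is rewound and a different token is resampled."""
    rng = random.Random(seed)
    out = []
    banned_at = {}  # position -> tokens forbidden there after a rewind
    while len(out) < max_tokens:
        pos = len(out)
        choices = [t for t in VOCAB if t not in banned_at.get(pos, set())]
        out.append(rng.choice(choices))
        text = " ".join(out)
        if any(s in text for s in stop_seqs):
            return text  # stop: terminate, keeping the text so far
        for b in banned_seqs:
            if b in text:
                n = len(b.split())
                start = len(out) - n
                # ban: forbid the phrase's first token at this position,
                # rewind past the phrase, and let the loop resample
                banned_at.setdefault(start, set()).add(b.split()[0])
                del out[start:]
                break
    return " ".join(out)

print(generate(stop_seqs=["shiver down"], banned_seqs=[]))
print(generate(stop_seqs=[], banned_seqs=["shiver down"]))
```

With a stop sequence the phrase survives in the output and generation ends there; with a ban the phrase never survives, because it is rewound and replaced, matching the two logs later in this thread.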

Also as originally mentioned by @mayaeary the banned_strings field names in ST should be correct, unless something new has changed. @Cohee1207 can confirm.

Cohee1207 commented 1 month ago

banned_strings is a correct name.

https://github.com/SillyTavern/SillyTavern/blob/ba6f7b7a98cf7a5eaf4f0e81da9779a9a668ced4/public/scripts/textgen-settings.js#L1156

Nexesenex commented 1 month ago

@LostRuins Indeed, I just retested. Your experimental branch works as intended without that change, while my fork needs it to work as intended. My bad for not testing on experimental first! Sorry!

Nexesenex commented 1 month ago

I retested again. The reason it worked here is that I had made the modification on experimental too before submitting the PR, to be sure it would work. Undoing the modifications returned things to the status quo ante: a stop sequence triggers instead:

Without the modification:

Generating (22 / 512 tokens) [( down 100.00%)]

(Stop sequence triggered: shiver down)
llama_perf_context_print:        load time =    3708.51 ms
llama_perf_context_print: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_perf_context_print:        eval time =       0.00 ms /    22 runs   (    0.00 ms per token,      inf tokens per second)
llama_perf_context_print:       total time =    2779.26 ms /    23 tokens

CtxLimit:1779/65536, Amt:22/512, Init:0.01s, Process:0.55s (554.0ms/T = 1.81T/s), Generate:2.22s (100.9ms/T = 9.91T/s), Total:2.77s (7.93T/s)
Output:  oh so suddenly you quote someone to test me... very well, my friend>
Adam: A shiver down

FileFormat: 6, Tokenizing: <Adam's thoughts:  oh so suddenly you quote someone to test me... very well, my friend>
A
Tokens Counted: 24
127.0.0.1 - - [20/Oct/2024 17:46:14] "POST /api/extra/tokencount HTTP/1.1" 200 -

With the modification:


(Banned Phrase Detected: shiver down - Add ID 559 to banlist at index 100, and rewinding 3 tokens)
Generating (11 / 36 tokens) [( cold 8.10%) (  12.25%) ( feeling 12.05%) ( phrase 11.97%) (... 10.39%)]
Generating (12 / 36 tokens) [( sh 49.54%) ( feeling 50.46%)]
Generating (13 / 36 tokens) [(iver 100.00%)]
DRY penalties [( down 0.80)]
Generating (14 / 36 tokens) [( ran 100.00%)]
Generating (15 / 36 tokens) [( down 100.00%)]
Generating (16 / 36 tokens) [( her 100.00%)]
DRY penalties [( spine 0.80)]
Generating (17 / 36 tokens) [( spine 100.00%)]
DRY penalties [(. 1.40)]
Generating (18 / 36 tokens) [(<|eot_id|> 20.50%) ( or 16.25%) (, 16.10%) ( is 14.34%)]

(EOS token triggered! ID:128009)
llama_perf_context_print:        load time =    3706.58 ms
llama_perf_context_print: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_perf_context_print:        eval time =       0.00 ms /    18 runs   (    0.00 ms per token,      inf tokens per second)
llama_perf_context_print:       total time =    2207.20 ms /    19 tokens

CtxLimit:108/65536, Amt:18/36, Init:0.00s, Process:0.55s (545.0ms/T = 1.83T/s), Generate:1.66s (92.2ms/T = 10.85T/s), Total:2.20s (8.17T/s)
Output:  recognizing an idiomatic phrase > a cold shiver ran down her spine

FileFormat: 6, Tokenizing: <Adam's thoughts:  recognizing an idiomatic phrase > a cold shiver ran down her sp
Tokens Counted: 21
127.0.0.1 - - [20/Oct/2024 17:52:47] "POST /api/extra/tokencount HTTP/1.1" 200 -

Could you check whether my PR is actually invalid, @concedo, please?

LostRuins commented 1 month ago

What are you trying to do? Cohee has already confirmed that banned_strings is the correct identifier for the phrase-banning feature in ST. As far as I can see, that's the only change in this PR? Are you editing it in the right section in ST?

Nexesenex commented 1 month ago

Ah, this was a matter of syntax.

I used ["string", "string"], as for the custom stopping strings, when the banned tokens box expects "string" "string" instead.

I got confused indeed. Now everything works as intended, without editing the experimental branch.
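To summarize the resolution, the two ST input boxes take different syntax. A sketch of both, under the assumption (based only on this thread) that the banned tokens box takes bare double-quoted strings separated by whitespace or newlines; the function names are hypothetical:

```python
import json
import re

def parse_stopping_strings(raw: str) -> list[str]:
    """ST custom stopping strings box: one JSON-serialized array."""
    return json.loads(raw)

def parse_banned_strings(raw: str) -> list[str]:
    """ST banned tokens box (assumed format): bare double-quoted
    strings, separated by whitespace or newlines."""
    return re.findall(r'"([^"]*)"', raw)

print(parse_stopping_strings('["[EDIT]", "OOM:"]'))    # ['[EDIT]', 'OOM:']
print(parse_banned_strings('"shiver down" "spine"'))   # ['shiver down', 'spine']
```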