LostRuins / koboldcpp

A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Newlines in the Llama 3 End Sequence break stopping #799

Open wereretot opened 2 months ago

wereretot commented 2 months ago

Having newlines in the end sequence causes the model to keep generating: it rarely emits those newlines itself, so the stop check never matches, and it usually starts moralizing instead of stopping.
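
For illustration, here is a minimal sketch of a literal stop-string check (a hypothetical helper, not koboldcpp's actual code) showing why trailing newlines in the end sequence prevent a match:

```python
# Minimal sketch of a literal stop-string check (illustrative only,
# not koboldcpp's actual implementation).

def hits_stop_sequence(generated_text: str, stop_sequence: str) -> bool:
    """Stop generation once the stop sequence appears verbatim in the output."""
    return stop_sequence in generated_text

# The model typically ends its turn with the bare tag, no surrounding newlines:
output = "Sure, here is the summary you asked for.<|eot_id|>"

# End sequence configured with trailing newlines never matches:
print(hits_stop_sequence(output, "<|eot_id|>\n\n"))  # False -> generation continues
# End sequence without newlines matches as expected:
print(hits_stop_sequence(output, "<|eot_id|>"))      # True  -> generation stops
```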

wereretot commented 2 months ago

This is happening with 70B IQ_2_XS

wereretot commented 2 months ago

Removing the newlines causes the model to stop as expected.

LostRuins commented 2 months ago

I will remove the newlines from the Llama 3 format in the next version. They seem to do more harm than good.
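
For reference, a rough sketch of how the Llama 3 instruct delimiters line up with a start/end sequence setup (the exact preset strings in KoboldAI Lite may differ; this is an illustration, not the shipped preset):

```python
# Rough illustration of the Llama 3 instruct turn delimiters.
# The official template puts "\n\n" after <|end_header_id|>, while <|eot_id|>
# follows the message text directly, with no surrounding newlines.
user_turn_start      = "<|start_header_id|>user<|end_header_id|>\n\n"
assistant_turn_start = "<|start_header_id|>assistant<|end_header_id|>\n\n"
end_of_turn          = "<|eot_id|>"

prompt = (
    "<|begin_of_text|>"
    + user_turn_start + "Summarize this article." + end_of_turn
    + assistant_turn_start  # generation starts here and should end with <|eot_id|>
)

# A stop sequence of just end_of_turn matches what the model actually emits;
# appending "\n\n" to it would not.
stop_sequences = [end_of_turn]
```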

wereretot commented 2 months ago

@LostRuins Yeah, even though I'm pretty sure the official format includes them.

LostRuins commented 2 months ago

Hi, this should be fixed in the latest version. It turned out to be caused by a bad tokenizer merge. Try it again with a freshly reconverted GGUF.
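
As a quick sanity check on a reconverted file, one could verify that the end-of-turn tag tokenizes to a single special token. A sketch using llama-cpp-python (an outside tool used here only for illustration, with a placeholder model path; not part of koboldcpp):

```python
# Hypothetical sanity check with llama-cpp-python: a correctly converted
# Llama 3 GGUF should map <|eot_id|> to one special token id rather than
# splitting it into several pieces.
from llama_cpp import Llama

# vocab_only avoids loading the full 70B weights just to inspect the tokenizer.
llm = Llama(model_path="path/to/llama3-70b.IQ2_XS.gguf", vocab_only=True)

tokens = llm.tokenize(b"<|eot_id|>", add_bos=False, special=True)
print(tokens)  # expect exactly one token id for a good conversion
```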