Pokora22 opened 1 year ago
The GPU version with Ooba works fine; it seems to be a problem only with Koboldcpp. As a workaround, I keep `</s>` in the GPU version but change it to `\n` when Koboldcpp is used. I also updated it to make more things customizable via the variables in that file.
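A minimal sketch of that workaround, assuming the proxy exposes which backend is connected (the function and variable names here are hypothetical, not the proxy's actual config):

```javascript
// Hypothetical sketch: choose the end-of-turn marker per backend.
// Koboldcpp mishandles the literal "</s>" string, so fall back to a
// newline there; GPU backends (e.g. Ooba) keep the real EOS marker.
function endOfTurnToken(backend) {
  return backend === "koboldcpp" ? "\n" : "</s>";
}

console.log(endOfTurnToken("koboldcpp")); // newline fallback
console.log(endOfTurnToken("ooba"));      // keeps "</s>"
```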
I loaded up a Vicuna cocktail after seeing the new prompt format in the changelog, but on every generation the model insists on injecting `<s>` at the end (or even to end sentences). I believe it's an end-of-sequence token, but it doesn't seem to be working properly with my setup: Koboldcpp v1.20 (run with `--usemirostat`) + SillyTavern 1.4.9 + Vicuna cocktail from here (https://huggingface.co/reeducator/vicuna-13b-cocktail), GGML version + proxy with the storywriter preset and the vicuna-cocktail prompt format.
Using the same model and Tavern, but connecting directly to the Kobold API, results in no `<s>` tokens in the text, so I believe it has to do with the prompt being formatted with `<s>` in it.
Could the model itself be incompatible with this format? If so, a note on which cocktail it works with would be appreciated. For now, I just set `afterAssistant = "";`.
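For context, a sketch of how a prompt-format preset might splice such a variable around each assistant turn; this is an assumption about the proxy's internals, and the `beforeAssistant` label is illustrative, not the proxy's real value:

```javascript
// Hypothetical preset fragment: "afterAssistant" is appended after each
// assistant turn when the prompt is assembled. Emptying it keeps the
// "</s>"/"<s>" marker out of the prompt entirely.
const preset = {
  beforeAssistant: "ASSISTANT: ", // illustrative label
  afterAssistant: "",             // was "</s>"; emptied as the workaround
};

function formatAssistantTurn(text) {
  return preset.beforeAssistant + text + preset.afterAssistant;
}

console.log(formatAssistantTurn("Hello")); // "ASSISTANT: Hello"
```

With `afterAssistant` empty, no sequence marker ever enters the prompt, which matches the observation that connecting without the proxy's formatting produces no stray `<s>` tokens.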