anon998 / simple-proxy-for-tavern

GNU Affero General Public License v3.0
112 stars · 6 forks

Vicuna cocktail inserts <s> into generated text. #5

Open Pokora22 opened 1 year ago

Pokora22 commented 1 year ago

I loaded up a Vicuna cocktail after seeing the new prompt format in the changelog, but on every generation the model insists on injecting <s> at the end of the output (or even at the ends of sentences). I believe it's an end-of-sequence token, but it doesn't seem to be handled properly with my setup:

Koboldcpp v1.20 (run with --usemirostat) + SillyTavern 1.4.9 + the Vicuna cocktail from https://huggingface.co/reeducator/vicuna-13b-cocktail (ggml version) + the proxy with the storywriter preset and the vicuna-cocktail prompt format.

Using the same model and Tavern but connecting directly to the Kobold API produces no <s> tokens in the text, so I believe the problem is that the proxy formats the prompt with <s> in it. Could the model itself be incompatible with this format? If so, a note on which cocktail it does work with would be appreciated.

For now, I just set afterAssistant = "";

anon998 commented 1 year ago

The GPU version with Ooba works fine; it seems to be a problem only with Koboldcpp. As a workaround, I keep </s> in the GPU version but change it to \n when Koboldcpp is used. I also updated the proxy to make more things customizable via the variables in that file.