Pokora22 opened 1 year ago
The GPU version with Ooba works fine; it seems to be a problem only with Koboldcpp. As a workaround, I keep `</s>` in the GPU version but change it to `\n` when Koboldcpp is used. I also updated it to make more things customizable via the variables in that file.
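A minimal sketch of that workaround, assuming the proxy exposes which backend is connected (the function and variable names here are hypothetical, not the proxy's actual config):

```javascript
// Hypothetical sketch: choose the end-of-turn marker per backend.
// Koboldcpp mishandles the literal "</s>" string, so fall back to a
// newline there; GPU backends (e.g. Ooba) keep the real EOS marker.
function endOfTurnToken(backend) {
  return backend === "koboldcpp" ? "\n" : "</s>";
}

console.log(endOfTurnToken("koboldcpp")); // newline fallback
console.log(endOfTurnToken("ooba"));      // keeps "</s>"
```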
I loaded up a Vicuna cocktail after seeing the new prompt format in the changelog, but on every generation the model insists on injecting `<s>` at the end (or even to end sentences). I believe it's an end-of-sequence token, but it doesn't seem to be working properly with my setup: Koboldcpp v1.20 (run with `--usemirostat`) + SillyTavern 1.4.9 + Vicuna cocktail from here (https://huggingface.co/reeducator/vicuna-13b-cocktail), GGML version + proxy with the storywriter preset and the vicuna-cocktail prompt format.
Using the same model and Tavern, but connecting directly to the Kobold API, results in no `<s>` tokens in the text, so I believe it has to do with the prompt being formatted with `<s>` in it.
Could the model itself be incompatible with this format? If so, a note on which cocktail it works with would be appreciated. For now, I just set `afterAssistant = "";`.
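For context, a sketch of how a prompt-format preset might splice such a variable around each assistant turn; this is an assumption about the proxy's internals, and the `beforeAssistant` label is illustrative, not the proxy's real value:

```javascript
// Hypothetical preset fragment: "afterAssistant" is appended after each
// assistant turn when the prompt is assembled. Emptying it keeps the
// "</s>"/"<s>" marker out of the prompt entirely.
const preset = {
  beforeAssistant: "ASSISTANT: ", // illustrative label
  afterAssistant: "",             // was "</s>"; emptied as the workaround
};

function formatAssistantTurn(text) {
  return preset.beforeAssistant + text + preset.afterAssistant;
}

console.log(formatAssistantTurn("Hello")); // "ASSISTANT: Hello"
```

With `afterAssistant` empty, no sequence marker ever enters the prompt, which matches the observation that connecting without the proxy's formatting produces no stray `<s>` tokens.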