anon998 / simple-proxy-for-tavern

Low quality output on vicuna-13b-cocktail-v1-4bit-128g/bot not adhering to defs #9

Open hpnyaggerman opened 1 year ago

hpnyaggerman commented 1 year ago

Good day. My issue is as follows: the output quality produced by the setup described in the README is much lower than what can be seen in screenshots such as https://rentry.org/llama-examples. My setup: I am using SillyTavern to connect to simple-proxy-for-tavern, which then connects to a server on my local network running vicuna-13b-cocktail-v1-4bit-128g. However, the output quality of my setup is a far cry from what you can see in the screenshots, which is a mystery that haunts me at night.

More specifically, the bot simply refuses to adhere to the character definitions, especially the narration format used in the greeting message and the message examples. Here's an example: I am using a modified Holodeck character card; here is the dialogue (https://imgur.com/a/khvM6mv) and here are the defs (https://imgur.com/a/q5E0fhI). It does not follow the narration format defined in the card, no matter how many times I try. Here is an example of interactions with a card that appears in the example screenshots (https://imgur.com/a/dYyUDwq). The same thing happens there, for one reason or another.

anon998 commented 1 year ago

The screenshots are likely using something like SuperCOT 30B or Alpasta. I don't really have a good prompt for the Vicuna-style models. If you have already changed the prompt format to "prompt-formats/vicuna-cocktail.mjs", try opening that file and changing "addFinalInstruction" to false. It might stick to the formatting better, but it will likely give short answers unless the greeting/example messages are long.
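For reference, the change being suggested would look roughly like this inside prompt-formats/vicuna-cocktail.mjs. This is only a sketch: the exact structure of that file may differ, and the `const` declaration shown here is an assumption; the point is simply to flip the `addFinalInstruction` flag mentioned above.

```js
// prompt-formats/vicuna-cocktail.mjs (sketch; only the relevant flag is shown,
// the rest of the file is left unchanged)

// Assumed to be declared near the top of the file. Setting it to false stops
// the proxy from appending a final instruction to the prompt, which can help
// the model stick to the formatting of the greeting/example messages, but it
// tends to produce shorter replies.
const addFinalInstruction = false; // was: true
```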

You can try these other models, which seem to work better with the Alpaca/verbose prompt format that was enabled by default:

https://huggingface.co/digitous/13B-HyperMantis_GPTQ_4bit-128g
https://huggingface.co/ausboss/llama-13b-supercot-4bit-128g
https://huggingface.co/4bit/WizardLM-13B-Uncensored-4bit-128g
https://huggingface.co/elinas/chronos-13b-4bit

hpnyaggerman commented 1 year ago

It would be nice if anons could share which models they are using here, as well as how well they perform with this proxy.