oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

Only able to generate gibberish (Wizard-Vicuna-13B-Uncensored-GPTQ) #6325

Open · JomSpoons opened this issue 3 months ago

JomSpoons commented 3 months ago

Describe the bug

I have been trying to use text-generation-webui to generate text with TheBloke/Wizard-Vicuna-13B-Uncensored-GPTQ:latest, but all I've managed to get are nonsensical, gibberish responses. Two screenshots and the log are provided below.

I found another person having the exact same problem several months ago (#4509), but the issue was automatically closed and I'm unable to reopen it myself. I wish I had more to say regarding this but I'm honestly at a total loss. Other models seem to work fine, including the 7B version of Wizard Vicuna. The logs don't give any indication as to why the generation is not working, so I can't make much sense of it. If anyone could reopen the previous issue I would be very grateful. Thank you.

Is there an existing issue for this?

Reproduction

1. Load Wizard Vicuna 13B GPTQ.
2. Load the Assistant or Example chat character.
3. Type "How are you today?"
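
For what it's worth, the same generation can also be reproduced outside the chat UI through the web UI's OpenAI-compatible API. This is only a rough sketch and assumes the server was started with the --api flag and is listening on the default port 5000:

```python
import requests

# Assumes text-generation-webui was launched with --api (API on port 5000 by default)
# and the Wizard Vicuna 13B GPTQ model is already loaded.
url = "http://127.0.0.1:5000/v1/chat/completions"
payload = {
    "messages": [{"role": "user", "content": "How are you today?"}],
    "max_tokens": 64,
    "temperature": 0.7,
}

response = requests.post(url, json=payload, timeout=120)
print(response.json()["choices"][0]["message"]["content"])
```

If the API output is gibberish as well, that would point at the model or loader rather than the chat interface.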

Screenshot

(Two Firefox screenshots attached, showing the gibberish responses.)

Logs

[log.txt](https://github.com/user-attachments/files/16571648/log.txt)

System Info

Windows 10, EVGA NVIDIA GeForce RTX 3060

YakuzaSuske commented 3 months ago

From my own experience, older quantizations tend to produce gibberish now that new updates have landed in things like llama.cpp, GPTQ and so on. See if you can find an ExLlamaV2 quant of that model; those usually don't give me gibberish with older models. If that doesn't work, your last bet would be to download the full fp16 model and load it in 4-bit in the UI using Transformers.
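
For the Transformers route, here's a rough sketch of loading the full-precision weights in 4-bit with bitsandbytes outside the UI. The repo id below is just a guess at the original fp16 upload, so substitute whatever unquantized copy you actually download:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Guess at the original fp16 repo; replace with the unquantized model you download.
model_id = "cognitivecomputations/Wizard-Vicuna-13B-Uncensored"

# Quantize to 4-bit NF4 on the fly while loading the fp16 weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "How are you today?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

In the UI this roughly corresponds to picking the Transformers loader and enabling the load-in-4bit option.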

Edit: If you are looking for a much newer and better 12B model that can cook up stories and such, and is effectively uncensored if you simply tell it that it is or make some minor changes, then try Magnum-12b-v2.5-kto. So far it's been good at roleplay, instruct and NSFW stuff.

JomSpoons commented 3 months ago

@YakuzaSuske No luck finding an ExLlamaV2 version of the model, but I appreciate the recommendation for Magnum and will give that a try. Is there anywhere I can find more models in a similar vein to these two? I'd like to stay up to date on that sort of thing.

YakuzaSuske commented 3 months ago

@JomSpoons Your best bet is the LocalLLaMA subreddit: https://www.reddit.com/r/LocalLLaMA/

I hang out there all the time since they post almost every day about AI news, drama, new model releases, etc.

Here are the "New Model" flair posts: https://www.reddit.com/r/LocalLLaMA/?f=flair_name%3A%22New%20Model%22