Open tkalevra opened 1 year ago
I think you might be running an incorrect/unsupported model format.
Try running a Vicuna model that is quantized to run with llama.cpp. One that works: https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g-GGML
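If you want to grab just the quantized file without cloning the whole repo, something like this should do it (a rough sketch; the exact filename is an assumption, so check the repo's file list first):

```python
# Sketch: download a single GGML file from the repo above with huggingface_hub.
# The filename is an assumption -- verify it against the repo's "Files" tab.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g-GGML",
    filename="vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin",  # assumed filename
)
print("Downloaded to:", path)
```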
I don't know if it generally works, but you downloaded only part 1 of 3 and also omitted the configs. The model card of that model also states:
NOTE: This "delta model" cannot be used directly. Users have to apply it on top of the original LLaMA weights to get actual Vicuna weights. See https://github.com/lm-sys/FastChat#vicuna-weights for instructions.
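For anyone curious what "apply it on top of the original LLaMA weights" means in practice: conceptually it is element-wise addition of the delta tensors onto the base tensors. A minimal sketch of that idea (the real FastChat script linked above handles sharded checkpoints and tokenizer files; the paths here are placeholders):

```python
# Conceptual sketch of delta application: vicuna = llama_base + delta.
# Real 13B checkpoints are sharded across several files; paths are placeholders.
import torch

base = torch.load("llama-13b/pytorch_model.bin", map_location="cpu")
delta = torch.load("vicuna-13b-delta-v1.1/pytorch_model.bin", map_location="cpu")

# Add the delta to the base weights, tensor by tensor.
vicuna = {name: base[name] + delta[name] for name in delta}
torch.save(vicuna, "vicuna-13b/pytorch_model.bin")
```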
I'm using Pi3141/gpt4-x-alpaca-native-13B-ggml.
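If you want to sanity-check a GGML model outside the web UI first, a minimal llama-cpp-python test looks roughly like this (the model path is a placeholder for wherever your downloaded .bin lives):

```python
# Minimal smoke test for a GGML model via llama-cpp-python
# (pip install llama-cpp-python). The model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="models/gpt4-x-alpaca-native-13B-ggml.bin")
out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
print(out["choices"][0]["text"])
```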
Hey @espressoelf, I see you are helping a lot of people. Just wanted to say thanks!
CONTEXT: I'm running Zorin OS (an Ubuntu spinoff, but what isn't these days...). Installation was successful and the web UI is responsive on 127.0.0.1:7887.
I've downloaded https://huggingface.co/lmsys/vicuna-13b-delta-v1.1/blob/main/pytorch_model-00001-of-00003.bin and copied the .bin to the appropriate folder, where it shows up in the GUI under "Load model".
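For reference, a raw PyTorch shard and a llama.cpp-style GGML file can be told apart by their first four bytes. A minimal check (the magic constants are an assumption based on llama.cpp's legacy file formats):

```python
# Sketch: inspect the first four bytes of a downloaded .bin to guess its format.
# Magic constants are assumptions based on llama.cpp's legacy file formats.
import struct
import sys

GGML_MAGICS = {0x67676D6C: "ggml", 0x67676D66: "ggmf", 0x67676A74: "ggjt"}

with open(sys.argv[1], "rb") as f:
    head = f.read(4)

if head.startswith(b"PK"):
    # Zip-based torch.save() checkpoint, e.g. pytorch_model-00001-of-00003.bin.
    print("PyTorch checkpoint shard -- not loadable by llama.cpp.")
else:
    (magic,) = struct.unpack("<I", head)
    print("llama.cpp format:", GGML_MAGICS.get(magic, f"unknown (0x{magic:08x})"))
```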
When I click submit, I receive an error in the terminal:
Any feedback or insights would be greatly appreciated, or an alternate model for me to attempt.