My list:
- Black-Engineer/llama-13b-pretrained-sft-do2-ggml-q4.bin
- Black-Engineer/oasst-llama13b-ggml-q4.bin
- Black-Engineer/oasst-llama-13b.bin
- Black-Engineer/oasst-llama-30b.bin
- eachadea/ggml-vicuna-13b-1.1-q4_1.bin
- eachadea/ggml-vicuna-13b-1.1-q4_2.bin
- eachadea/ggml-vicuna-13b-1.1-q4_3.bin
- eachadea/ggml-vicuna-7b-1.1-q4_0.bin
- eachadea/ggml-vicuna-7b-1.1-q4_1.bin
- eachadea/ggml-vicuna-7b-1.1-q4_2.bin
(I'd also like to try these, but they're not currently supported by llama.cpp as far as I know:
- mongolian-basket-weaving/oasst-stablelm-7b-sft-v7-epoch-3-ggml-q4_2.bin
- mongolian-basket-weaving/oasst-stablelm-7b-sft-v7-epoch-3-ggml-q4_3.bin

Also, the models released by The Bloke are worth supporting!)
How about WizardLM (https://github.com/nlpxucan/WizardLM)? Could this new model be supported in Serge, please? I heard it's very fast and generates good results.
I was going to say WizardLM; it's been getting some good reviews.
https://huggingface.co/reeducator/vicuna-13b-free/discussions
It tries to reduce censorship.
Google PaLM looks interesting. https://9to5google.com/2023/05/10/google-palm-2/
However, as far as I can tell, the framework is open source and might be available, but I can't find any info on the training data. I'm also very new to AI, so I might be looking in the wrong places for what Serge would need to use it.
I'd really like to see support for Guanaco models. Edit: PR https://github.com/nsarrazin/serge/pull/334
From my testing, the Q4_0 and Q8_0 variants work the best/fastest; the K-quant varieties take too long on CPU only. Honestly, I think it's a toss-up between Q4 and Q8, since both run about the same within a margin of error.
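For anyone who wants to reproduce that comparison on their own hardware, here is a minimal timing sketch using the llama-cpp-python bindings; the model paths, prompt, and thread count are placeholders for whatever quantized weights you actually have locally.

```python
import time

from llama_cpp import Llama  # pip install llama-cpp-python

# Placeholder paths: point these at the quantized variants you want to compare.
MODELS = [
    "weights/wizardlm-7b-q4_0.bin",
    "weights/wizardlm-7b-q8_0.bin",
]
PROMPT = "Explain in two sentences what quantization does to a language model."

for path in MODELS:
    # Load the model on CPU; adjust n_threads to your core count.
    llm = Llama(model_path=path, n_ctx=512, n_threads=8, verbose=False)

    start = time.perf_counter()
    out = llm(PROMPT, max_tokens=128)
    elapsed = time.perf_counter() - start

    # The bindings return an OpenAI-style dict with token usage counts.
    n_tokens = out["usage"]["completion_tokens"]
    print(f"{path}: {n_tokens} tokens in {elapsed:.1f} s "
          f"({n_tokens / elapsed:.1f} tok/s)")
```

Generation speed varies a lot with prompt length and thread count, so run each model a few times and compare averages rather than a single pass.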
CodeGen2.5
I get this odd error when trying to run Wizard.
llama.cpp: loading model from /usr/src/app/weights/Wizard-30B-Q4_1.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000003; is this really a GGML file?
llama_init_from_file: failed to load model
Did you get this? How did you resolve it?
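For what it's worth, that message usually means the file uses a newer GGML container revision (0x67676a74 is the `ggjt` magic, here at version 3) than the llama.cpp build bundled with Serge understands. Below is a small sketch to check what header a given .bin actually carries, assuming the usual GGML layout of a 4-byte magic followed by a little-endian uint32 version:

```python
import struct
import sys

# Magic values as llama.cpp compares them (read as little-endian uint32).
KNOWN_MAGICS = {
    0x67676D6C: "ggml (unversioned)",
    0x67676D66: "ggmf (versioned)",
    0x67676A74: "ggjt (versioned, mmap-able)",
}

path = sys.argv[1]  # e.g. /usr/src/app/weights/Wizard-30B-Q4_1.bin
with open(path, "rb") as f:
    magic, version = struct.unpack("<II", f.read(8))

if magic == 0x67676D6C:
    # The unversioned ggml format has no version field after the magic.
    print(f"magic=0x{magic:08x} (ggml, unversioned)")
else:
    print(f"magic=0x{magic:08x} ({KNOWN_MAGICS.get(magic, 'unknown')}), "
          f"version={version}")
```

If the reported version is newer than what the bundled llama.cpp was built against, updating Serge (or converting the weights to a format it knows) is the usual fix.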
WizardCoder would be nice to have!
Very interested in the just-released Code Llama: https://ai.meta.com/blog/code-llama-large-language-model-coding/
Fixed via #866
Hey everyone!
Just opening an issue to track which models people would like to see supported with Serge.
Are there any others you would like to see?