Closed: loretoparisi closed this 10 months ago
The model is missing some keys and cannot be converted to GGUF format:
'rms_norm_eps'
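To see which keys a checkpoint's config lacks before attempting a conversion, something like the following rough sketch may help. It is not the converter's own logic: the helper names are hypothetical, and the only expected key taken from this thread is 'rms_norm_eps'; extend the tuple with whatever else your converter reports.

```python
import json

# Only 'rms_norm_eps' comes from the error reported in this thread;
# any other key you add here is your own assumption.
EXPECTED_KEYS = ("rms_norm_eps",)


def missing_config_keys(config, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent from a config dict."""
    return [key for key in expected if key not in config]


def missing_keys_in_file(path, expected=EXPECTED_KEYS):
    """Load a JSON config file and report which expected keys it lacks."""
    with open(path, encoding="utf-8") as fh:
        return missing_config_keys(json.load(fh), expected)


if __name__ == "__main__":
    # Hypothetical config contents, for illustration only.
    sample = {"hidden_size": 4096}
    print(missing_config_keys(sample))
```

Pointing `missing_keys_in_file` at the model's config.json should list the keys that would trip the conversion, assuming the converter reads them from that file.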
A full set of Llama.cpp-compatible .gguf files is available at
https://huggingface.co/maddes8cht/adept-persimmon-8b-base-gguf
and
https://huggingface.co/maddes8cht/adept-persimmon-8b-chat-gguf
For the moment, CUDA acceleration does not seem to work, so you need to use -ngl 0
with the cuBLAS versions.
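A minimal sketch of launching a cuBLAS build with the -ngl 0 workaround from a script. The binary path, model filename, and helper name are placeholders I made up; only the -ngl 0 setting comes from the comment above, and -m / -p are the usual model and prompt flags of the Llama.cpp main example.

```python
import subprocess


def build_cpu_only_command(binary, model_path, prompt):
    """Build an argument list that keeps all layers on the CPU (-ngl 0)."""
    return [binary, "-m", model_path, "-ngl", "0", "-p", prompt]


if __name__ == "__main__":
    # Placeholder paths: substitute your own build and downloaded .gguf file.
    cmd = build_cpu_only_command("./main", "persimmon-8b-chat.gguf", "Hello")
    print(" ".join(cmd))
    # Uncomment to actually run the binary:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if the prompt contains spaces.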
Exploring possibilities to support GGML / GGUF formats to run with Llama.cpp