I've followed the prerequisites, but I can't run RedPajama 3B with llama.cpp. I think it's only supported inside the ggml repo, right?
I went ahead anyway, assuming gpt-llama.cpp does something to enable it.
I've placed the model like so:
../llama.cpp/models/ggml/gpt-neox/rp-instruct-3b-v1-ggml-model-q4_0.bin
Requesting http://localhost:443/v1/models returns:
Missing API_KEY. Please set up your API_KEY (in this case path to model .bin in your ./llama.cpp folder).
I'm not sure where to put this path. So far I've tried:
API_KEY=<path to model> npm start
I also tried entering <path to model> as Swagger's Bearer token. Where do I set this API_KEY?
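For reference, what I entered in Swagger should be roughly equivalent to the request below. This is only a sketch: I'm assuming the Bearer token is sent as a standard Authorization header, and build_models_request is just a helper name I made up for illustration. The model path is left as a placeholder.

```python
import urllib.request

# Placeholder: substitute the real path to the model .bin file.
MODEL_PATH = "<path to model>"


def build_models_request(model_path, base_url="http://localhost:443"):
    """Build (but do not send) a GET /v1/models request.

    Assumption: the server expects the model path as the Bearer token
    in a standard Authorization header.
    """
    return urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {model_path}"},
        method="GET",
    )


if __name__ == "__main__":
    req = build_models_request(MODEL_PATH)
    # Show what would be sent, without needing the server to be up.
    print(req.get_method(), req.full_url)
    print("Authorization:", req.get_header("Authorization"))
    # To actually send it (server must be running):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
```

If this is not how the API_KEY is supposed to be passed, that may be part of what I'm getting wrong.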
Edit: I also tried running the model with ggml directly, but that isn't working either. I'm confused about how to run RedPajama:
./bin/gpt-neox -m ../../models/rp-instruct-3b-v1-ggml-model-q4_0.bin -p "How do I build a website?"
main: seed = 1684913741
gpt_neox_model_load: loading model from '../../models/rp-instruct-3b-v1-ggml-model-q4_0.bin' - please wait ...
gpt_neox_model_load: n_vocab = 50432
gpt_neox_model_load: n_ctx = 2048
gpt_neox_model_load: n_embd = 2560
gpt_neox_model_load: n_head = 32
gpt_neox_model_load: n_layer = 32
gpt_neox_model_load: n_rot = 80
gpt_neox_model_load: par_res = 0
gpt_neox_model_load: ftype = 2
gpt_neox_model_load: qntvr = 0
gpt_neox_model_load: ggml ctx size = 3572.54 MB
gpt_neox_model_load: memory_size = 640.00 MB, n_mem = 65536
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::_M_create
Aborted
[fedorauser@W10JB1S9K3 build]$