Hi,
I'm relatively new to go-llama. I tried to replicate the example code, but instead of reading the prompt from the console, the prompt comes in via a POST request.
I'm using Llama 2 13B Chat, and I used the convert.py file to convert the two consolidated.0x.pth files into a nice .gguf (f16) model.
Now the problem with the Predict method:
It only ever returns #. The callback for newly generated tokens also only receives #. Do I need to do some conversion on the prompt or the output? As far as I can see in the ./example/main.go file, nothing extra is needed.
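For reference, this is roughly how I handle the request, boiled down to the relevant parts (a simplified sketch; the option values are illustrative and the calls follow the go-llama.cpp example API):

```go
package main

import (
	"fmt"
	"io"
	"net/http"

	llama "github.com/go-skynet/go-llama.cpp"
)

// Model instance, created once at startup (see the instantiation below).
var l *llama.LLama

// handlePredict reads the prompt from the POST body and streams the
// generated tokens back to the client.
func handlePredict(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	prompt := string(body)

	// The callback fires once per generated token; in my case every
	// token it receives is just "#".
	_, err = l.Predict(prompt,
		llama.SetTokenCallback(func(token string) bool {
			fmt.Fprint(w, token)
			return true // keep generating
		}),
		llama.SetTokens(128), // illustrative values
		llama.SetThreads(4),
	)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}
```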
And here is the instantiation of the model:
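Roughly like this, continuing the sketch above (the model path and option values here are placeholders, not my exact settings):

```go
func main() {
	var err error
	// Load the converted .gguf model once at startup.
	l, err = llama.New(
		"./models/llama-2-13b-chat.gguf", // placeholder path to the converted model
		llama.EnableF16Memory,            // matches the f16 conversion
		llama.SetContext(512),            // illustrative context size
	)
	if err != nil {
		panic(err)
	}

	http.HandleFunc("/predict", handlePredict)
	if err := http.ListenAndServe(":8080", nil); err != nil {
		panic(err)
	}
}
```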
I think it's probably something trivial. Thanks!