Closed: saul-jb closed this issue 1 year ago
The output you mentioned here seems to come from the llama.cpp code itself and not from the model, so I don't think the wrapper is related to the issue you are facing.

I suggest you try using a chat-tuned version of the Llama 2 model instead, like this one for example: https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF

From my own experience, `GeneralChatPromptWrapper` works better for most models.

You can also try setting a custom `systemPrompt` parameter on a `LlamaChatSession`.
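A minimal sketch of those two suggestions combined, assuming the node-llama-cpp v2 API (`LlamaModel`, `LlamaContext`, `LlamaChatSession`) and a placeholder path to a chat-tuned GGUF file; adjust the file name to whatever you actually downloaded:

```typescript
import path from "path";
import {fileURLToPath} from "url";
import {LlamaModel, LlamaContext, LlamaChatSession, GeneralChatPromptWrapper} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Placeholder model path: point this at the chat-tuned GGUF you downloaded
const model = new LlamaModel({
    modelPath: path.join(__dirname, "models", "llama-2-7b-chat.Q4_K_M.gguf")
});
const context = new LlamaContext({model});

// General-purpose prompt wrapper plus a custom system prompt
const session = new LlamaChatSession({
    context,
    promptWrapper: new GeneralChatPromptWrapper(),
    systemPrompt: "You are a helpful, concise assistant."
});

console.log(await session.prompt("Hi there, how are you?"));
```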
If none of these work, try downloading a newer release of llama.cpp and compiling it from source using this command:

`node-llama-cpp download --release latest`
Issue description
The Llama 2 templates appear not to work with Llama 2 models.
Expected Behavior
Using the `LlamaChatPromptWrapper`, I would expect the model to produce a normal response.

Actual Behavior
When I use `LlamaChatPromptWrapper`, it seems to get stuck and produce the following output:

I suspect this is a result of it not understanding the template/stop tokens.
Steps to reproduce
Use the 7B model: https://huggingface.co/TheBloke/Llama-2-7B-GGUF
Run the following code:
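(The original snippet isn't reproduced above; the following is a minimal reproduction sketch, assuming the node-llama-cpp v2 API and a placeholder GGUF file name, rather than the reporter's exact code.)

```typescript
import path from "path";
import {fileURLToPath} from "url";
import {LlamaModel, LlamaContext, LlamaChatSession, LlamaChatPromptWrapper} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Placeholder model path: the 7B GGUF file downloaded from the link above
const model = new LlamaModel({
    modelPath: path.join(__dirname, "models", "llama-2-7b.Q4_K_M.gguf")
});
const context = new LlamaContext({model});

// The Llama 2 specific prompt wrapper, which is where the problem shows up
const session = new LlamaChatSession({
    context,
    promptWrapper: new LlamaChatPromptWrapper()
});

console.log(await session.prompt("Hi there, how are you?"));
```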
My Environment
`node-llama-cpp` version

Additional Context
The `GeneralChatPromptWrapper` seems to work normally, with the exception of adding "\n\n### :" to the stop tokens. Why does the general prompt wrapper work whereas the Llama-specific one doesn't? Is this an issue with the model file itself, e.g. a bad conversion? Is there a better way to debug this?

Related: https://huggingface.co/TheBloke/Llama-2-7B-GGUF/discussions/1
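One way to narrow this down (a sketch, not a documented debugging workflow) is to print what each wrapper actually sends to the model and which stop strings it registers. This assumes the wrappers expose `wrapPrompt()`, `getStopStrings()`, and a `wrapperName` property as in node-llama-cpp v2's `ChatPromptWrapper` base class; if those names differ in your installed version, the same idea applies to whichever methods build the prompt text.

```typescript
import {LlamaChatPromptWrapper, GeneralChatPromptWrapper} from "node-llama-cpp";

const systemPrompt = "You are a helpful assistant.";

// Compare what each wrapper feeds the model for the first prompt, and which
// stop strings it registers (method names are an assumption based on the v2
// ChatPromptWrapper base class; verify against your installed version).
for (const wrapper of [new LlamaChatPromptWrapper(), new GeneralChatPromptWrapper()]) {
    console.log(`--- ${wrapper.wrapperName} ---`);
    console.log(wrapper.wrapPrompt("Hi there, how are you?", {systemPrompt, promptIndex: 0}));
    console.log("stop strings:", wrapper.getStopStrings());
}
```

Comparing the wrapped prompt against the template the model was fine-tuned on (and against what the base, non-chat model expects) should make it clearer whether the problem is the template, the stop tokens, or the model file itself.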
Relevant Features Used
Are you willing to resolve this issue by submitting a Pull Request?
Yes, I have the time, but I don't know how to start. I would need guidance.