mjwweb opened this issue 7 months ago
Perhaps the chat_format templates should be stored in a configuration file.
In this way, users can edit the .conf to adapt to any model without changing the code.
@mjwweb Maybe you can try the chat_format alpaca or snoozy; they look similar.
@tastypear alpaca seems to be the closest but still doesn't work well. The model outputs a bunch of gibberish.
I found this template for deepseek on hugging face to refer to:
prompt = "Tell me about AI"
prompt_template = f'''You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.
### Instruction:
{prompt}
### Response:
'''
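As a sanity check, filling that template is plain string formatting. A minimal sketch (build_prompt is just an illustrative helper, not a library function):

```python
# System message taken verbatim from the Deepseek template above.
SYSTEM = (
    "You are an AI programming assistant, utilizing the Deepseek Coder model, "
    "developed by Deepseek Company, and you only answer questions related to "
    "computer science. For politically sensitive questions, security and privacy "
    "issues, and other non-computer science questions, you will refuse to answer."
)

def build_prompt(prompt: str) -> str:
    # Mirrors the f-string template: system message, instruction, open response.
    return f"{SYSTEM}\n### Instruction:\n{prompt}\n### Response:\n"

print(build_prompt("Tell me about AI"))
```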
and I've tried to create my own definition in llama_chat_format.py (for reference): https://github.com/abetlen/llama-cpp-python/blob/f3b844ed0a139fc5799d6e515e9d1d063c311f97/llama_cpp/llama_chat_format.py
@register_chat_format("deepseek")
def format_deepseek(
    messages: List[llama_types.ChatCompletionRequestMessage],
    **kwargs: Any,
) -> ChatFormatterResponse:
    # "### Response:" matches the template above (was "### Assistant:")
    _roles = dict(user="### Instruction:", assistant="### Response:")
    _sep = "\n"
    _system_message = "You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer."
    _messages = _map_roles(messages, _roles)
    _messages.append((_roles["assistant"], None))
    _prompt = _format_add_colon_single(_system_message, _messages, _sep)
    return ChatFormatterResponse(prompt=_prompt)
and applying the chat_format:
llm = Llama(
    model_path="./models/deepseek-llm-7b-base.Q8_0.gguf",
    chat_format="deepseek",
    n_ctx=2096,
    n_threads=6,
    n_gpu_layers=100,
)
however I am getting this error thrown when trying to run inference:
File "/home/ubuntu/LLMs/ai/llm-env/lib/python3.10/site-packages/llama_cpp/llama_chat_format.py", line 61, in get_chat_completion_handler
return CHAT_HANDLERS[name]
KeyError: 'deepseek'
Where the error is occurring in llama_chat_format.py:
def get_chat_completion_handler(name: str) -> LlamaChatCompletionHandler:
    return CHAT_HANDLERS[name]
I'm not sure whether the new definition I added is actually being applied to the pip-installed package, or if there is some other issue. I'm new to working in Python environments, so this is all experimental. I have other models working as expected with the existing chat_format templates, but I'm still trying to figure out how to create and register new templates.
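For anyone hitting the same KeyError: registration happens via a decorator at import time, so the name is missing whenever the file you edited isn't the copy your interpreter actually imports (e.g. you edited a checkout but run the pip-installed package). A minimal sketch of the registry pattern (names are illustrative, not the library's exact internals):

```python
from typing import Any, Callable, Dict

# Module-level registry, populated as decorated functions are imported.
CHAT_HANDLERS: Dict[str, Callable[..., Any]] = {}

def register_chat_format(name: str):
    def decorator(func):
        CHAT_HANDLERS[name] = func  # runs once, when the module is imported
        return func
    return decorator

@register_chat_format("alpaca")
def format_alpaca(messages, **kwargs):
    return "<formatted prompt>"

def get_chat_completion_handler(name: str):
    # KeyError: 'deepseek' here means the decorated function was never
    # imported -- i.e. the edit landed in a different copy of the package
    # than the one Python loads at runtime.
    return CHAT_HANDLERS[name]

get_chat_completion_handler("alpaca")  # registered, so this succeeds
```

A quick way to check which copy you're running is `print(llama_cpp.llama_chat_format.__file__)` and confirming it is the file you edited.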
@mjwweb I have added --chat-format to llama.cpp/examples/server/api_like_OAI.py (not llama-cpp-python) --> my repo. You can edit chat-format.toml to add your template.
For deepseek-coder-instruct, maybe like this:
[deepseek-coder-instruct]
prefix = "You are an AI programming assistant.\n"
system = ""
user = "### Instruction:{content}\n"
assistant = "### Response:{content}\n"
suffix = "### Response:"
then run:
./server -m model.gguf
and
python api_like_OAI.py --chat-format deepseek-coder-instruct
I haven't tested this template myself, but it should be easy to adjust to make it work.
Is your feature request related to a problem? Please describe.
Request for a deepseek chat_format template

Additional context
Deepseek prompt template found here: https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF