ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Feature Request: Reintroduce chat / instruct templates #8681

Closed ericcurtin closed 3 days ago

ericcurtin commented 1 month ago

Feature Description

The chat / instruct template feature was removed:

https://github.com/ggerganov/llama.cpp/pull/7675 https://github.com/ggerganov/llama.cpp/issues/7757

We were finding this feature extremely useful.

Motivation

We were actively using it until it was removed.

Possible Implementation

https://github.com/ggerganov/llama.cpp/pull/7675

ggerganov commented 1 month ago

Use the -cnv flag instead - it supports all models with defined chat templates

ericcurtin commented 1 month ago

Building the master branch now.

I used to run something like this:

$ llama-main -m /models/granite --log-disable --instruct
> Tell me about Georgi
 George Orwell (185-194) was an English writer and political activist. He is best known for his dystopian novel "1984" which was published in 194. The book is considered a masterpiece of modernist fiction. Orwell's most famous work, "1984," was published in 194. The book is considered a masterpiece of modernist fiction. Orwell's most famous work, "1984," was published in 194. He is also known for his essays and his political activism, which included the writing of "Animal Farm."
>

Nice and simple: it gave me a nice little Ollama/ChatGPT-style terminal interface. So now I try (on master):

$ llama-cli -m /models/granite --log-disable -cnv -p "You respond to instructions"
<|im_start|>system
You respond to instructions<|im_end|>

> Who are ABBA?
ABBA were a Swedish rock band formed in 1972. The group consisted of Agnetha Fältskog, Björn Ulvaeus, Benny Andersson, and Anders Björk. They are best known for their hit songs "Waterloo", "Dancing Queen", and "Yes, Please".<|im_end|>
>

I really don't want noise like:

<|im_start|>system
You respond to instructions<|im_end|>

or

<|im_end|>

Is there any way to get rid of this noise, which --instruct didn't show?

ggerganov commented 1 month ago

Add --log-disable. The <|im_end|> token should not appear - if it does, it likely means you are using a model that hasn't marked this token as special, so you might want to reconvert your model.

ericcurtin commented 1 month ago

--log-disable was already added. I appreciate the advice, but this model comes from HuggingFace; I can't expect users to reconvert it, nor do I know how to. With --instruct this was not required; it seemed to work perfectly for all the models I was using.
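
One possible stopgap, for cases where the output is captured by a wrapper script rather than read straight off the terminal, is to strip the markers after the fact. The sketch below is only an illustration and not part of llama.cpp: the function name strip_chatml_noise is hypothetical, and the marker strings are copied from the session output above, so models with other templates would need different patterns.

```python
import re

# Marker strings taken from the session output in this thread (an assumption:
# other models may use different template markers).
CHATML_BLOCK = re.compile(r"<\|im_start\|>.*?<\|im_end\|>\n?", re.DOTALL)
CHATML_TOKEN = re.compile(r"<\|im_(?:start|end)\|>")


def strip_chatml_noise(text: str) -> str:
    """Remove echoed ChatML prompt blocks and stray markers from captured output.

    Hypothetical helper for post-processing text; it does not touch the model.
    """
    # Drop complete <|im_start|>...<|im_end|> blocks, such as the echoed system prompt.
    text = CHATML_BLOCK.sub("", text)
    # Drop any leftover marker, such as a trailing <|im_end|> after a reply.
    return CHATML_TOKEN.sub("", text)


if __name__ == "__main__":
    captured = (
        "<|im_start|>system\nYou respond to instructions<|im_end|>\n"
        "\n> Who are ABBA?\nABBA were a Swedish rock band.<|im_end|>\n>"
    )
    print(strip_chatml_noise(captured))
```

This only hides the symptom in the captured text; it does not fix a model whose end-of-turn token is not marked as special.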

ericcurtin commented 1 month ago

Without a clean TUI for all models, like we had with "--instruct", I'm not sure what the use case for "-cnv" is. If the use case is testing, interactive modes aren't really useful for that, since they can't be scripted.

github-actions[bot] commented 3 days ago

This issue was closed because it has been inactive for 14 days since being marked as stale.