ggerganov / llama.cpp

LLM inference in C/C++

Feature Request: Simplified conversation mode interface #9134

Open ericcurtin opened 3 weeks ago

ericcurtin commented 3 weeks ago


Feature Description

I was using a minor fork of llama.cpp (https://github.com/containers/ramalama) that had an interface like this:

$ ramalama run granite-code
> Write a hello world application in python

print("Hello World")

>

Now I've moved back to using upstream, and the interface looks like this:

$ ramalama run granite-code
<|im_start|>system
You are a helpful assistant<|im_end|>

> Tell me a joke
Why did the tomato turn red?
Because it saw the salad dressing!
<|im_end|>

>

It's less clean. Could we, either by default or via a command-line flag, stop printing the "<|im_start|>system", "You are a helpful assistant<|im_end|>", and "<|im_end|>" parts?

Motivation

A simplified interface that's more aesthetically pleasing.

Possible Implementation

Change the default behavior, or introduce a new command-line flag that suppresses printing of these chat-template tokens.
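
A rough standalone sketch of the flag approach (the strip_control_tokens helper and the no_special_tokens flag are made-up names, not llama.cpp internals, and a real patch would presumably filter on the vocab's special-token attributes rather than hard-coded ChatML strings):

#include <iostream>
#include <string>
#include <vector>

// Hypothetical filter: drop ChatML-style control tokens from a piece of
// model output before it is echoed to the terminal.
static std::string strip_control_tokens(std::string piece) {
    static const std::vector<std::string> control = {
        "<|im_start|>system", "<|im_start|>user",
        "<|im_start|>assistant", "<|im_end|>",
    };
    for (const auto & tok : control) {
        // erase every occurrence of this control token
        for (size_t pos; (pos = piece.find(tok)) != std::string::npos; ) {
            piece.erase(pos, tok.size());
        }
    }
    return piece;
}

int main() {
    // Example: a raw piece as it currently appears in conversation mode.
    std::string raw = "Why did the tomato turn red?\n"
                      "Because it saw the salad dressing!\n<|im_end|>\n";
    // With a hypothetical --no-special-tokens flag set, echo the filtered text.
    bool no_special_tokens = true;
    std::cout << (no_special_tokens ? strip_control_tokens(raw) : raw);
    return 0;
}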

ericcurtin commented 1 week ago

Maybe this:

--in-prefix "" --in-suffix "" --no-display-prompt

solves this issue...
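
For reference, the full invocation would then look something like this (the binary name and model path here are just examples):

$ llama-cli -m granite-code.gguf -cnv --in-prefix "" --in-suffix "" --no-display-prompt
> Write a hello world application in python

print("Hello World")

>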