rendezllama

Rendezllama is a text interface for running a local chatbot based on ggerganov's llama.cpp, with commands to guide, edit, and regenerate tokens on the fly.

For now, there's just a command-line interface, but the plan is to make a progressive web app that connects with the chatbot running on a home server.

Chat CLI

Assuming you already have the quantized model weights and can compile C++, you can try the assistant_plain example with a few commands:

# If undefined, assume the 7B model exists in a sibling llama.cpp/ dir.
MODEL="${MODEL:-../llama.cpp/models/7B/ggml-model-q4_0.gguf}"
# Make just creates a bld/ directory and invokes CMake to build there.
make
# Run with specific settings from a file. They can be given as flags too.
./bld/src/chat/chat \
  --x_setting example/prompt/assistant_plain/setting.sxpb \
  --thread_count 8 \
  --model "${MODEL}"
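
The Makefile shortcut above is just a thin wrapper around CMake. If you would rather invoke CMake directly, the equivalent steps look roughly like this (a sketch assuming a standard out-of-source build; exact cache options are up to you):

# Configure into bld/ and compile, mirroring what `make` does.
cmake -B bld
cmake --build bld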

See the example/prompt/ directory for more interesting/whimsical examples.

Chat CLI Options

Chat CLI Commands

In the chat, most input you type is prefixed with the protagonist's name and followed by the confidant's generated dialogue line. There are also some special inputs and commands that help keep an infinite chat from going off the rails. Remember, the recent chat content is just a rolling prompt concatenated to the end of the priming prompt, so its quality is just as important!
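
For example (a hypothetical transcript with placeholder names), if the protagonist is User and the confidant is Bot, typing Hello! effectively appends the following to the rolling prompt, and the model then completes Bot's line:

User: Hello!
Bot: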