
Minimal R wrapper for llama.cpp
MIT License

rllama

rllama wraps llama.cpp - a C/C++ library for running Large Language Model (LLM) inference on the CPU. (Reference paper)

This is a minimum viable product which accepts input and produces output, but the quality, interface, capabilities and configurability are all very (very!) basic.


Future (contributions welcomed)

Installation

You can install from GitHub with:

# install.packages('remotes')
remotes::install_github('coolbutuseless/rllama')

The only dependencies are:

Downloading a model

To get started, I suggest grabbing the Vicuna model ggml-vic7b-q5_0.bin from here.

This is a small model (~5GB) with 7 billion parameters quantized to 5 bits per parameter.

Any other model supported by llama.cpp should work. Check out the list of supported models on the llama.cpp github page.
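Once you've picked a model, it can be fetched from R with `download.file()`. This is just a sketch: the URL below is a placeholder (substitute the real link for the model you chose), and the destination directory is an example only. The timeout is raised because base R's default of 60 seconds is far too short for a multi-gigabyte download.

```r
# Placeholder URL - replace with the actual link to your chosen model
model_url  <- "https://example.com/ggml-vic7b-q5_0.bin"
model_path <- file.path("~/models", basename(model_url))

# Create the destination directory if it doesn't exist yet
dir.create(dirname(model_path), showWarnings = FALSE, recursive = TRUE)

# Large download (~5GB): raise the timeout and write in binary mode
options(timeout = 3600)
download.file(model_url, destfile = model_path, mode = "wb")
```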

Model versions

Note: This package uses the llama.cpp code from around 25 May 2023.

The quantization formats (e.g. Q4, Q5 and Q8) have all changed within the last month.
Any older model files you have will probably not work with the latest llama.cpp. You’ll either have to requantize your models, or just download one in the appropriate format (e.g. from here).

Platform notes

This package has only been tested on macOS, so please let me know of any issues. PRs welcome.

Using rllama

library(rllama)

# Initialise llama.cpp with a downloaded model file
ctx <- llama_init("/Users/mike/projectsdata/llama.cpp/ggml-vic7b-q5_0.bin")
llama(ctx, prompt = "The apple said to the banana", n = 400)
#> , "You're not as smart as I am."
#> The banana replied, "That's okay. I'm just a fruit and you're a computer program. What do you expect?"
#> The apple said, "I can do things like tell jokes and play games that you can't."
#> The banana said, "Well, I can be used for making smoothies and baking cakes."
#> The apple said, "That may be true but at least I have some intelligence."
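A single context can be reused across multiple prompts. The sketch below assumes only the `llama_init()` / `llama()` interface shown above; the model path is an example, and whether `llama()` prints or returns its output may differ in your version of the package.

```r
library(rllama)

# Initialise once, then generate from several prompts with the same context
ctx <- llama_init("~/models/ggml-vic7b-q5_0.bin")

prompts <- c(
  "The apple said to the banana",
  "Once upon a time in a data centre"
)

for (p in prompts) {
  cat("PROMPT:", p, "\n")
  llama(ctx, prompt = p, n = 100)  # generate up to 100 tokens
  cat("\n\n")
}
```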

Licenses

Acknowledgements