cfahlgren1 / observers

A Lightweight Library for AI Observability

[FEAT] Add support for `llama-cpp-python` #7

Open · davidberenstein1957 opened 4 days ago

davidberenstein1957 commented 4 days ago

We want to observe interactions with `llama-cpp`. For inspiration, see how the OpenAI client is wrapped in https://github.com/cfahlgren1/observers/blob/main/src/observers/observers/models/openai.py. The intended usage:

```python
from llama_cpp import Llama

from observers.observers.models.llama_cpp import wrap_llama_cpp

llm = Llama(
    model_path="./models/7B/llama-model.gguf",
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # seed=1337,  # Uncomment to set a specific seed
    # n_ctx=2048,  # Uncomment to increase the context window
)
llm = wrap_llama_cpp(llm)

# Generate a completion; create_completion can also be called directly
output = llm(
    "Q: Name the planets in the solar system? A: ",  # Prompt
    max_tokens=32,  # Generate up to 32 tokens; set to None to generate up to the end of the context window
    stop=["Q:", "\n"],  # Stop generating just before the model would produce a new question
    echo=True,  # Echo the prompt back in the output
)
```
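
For whoever picks this up, here is a minimal sketch of what `wrap_llama_cpp` could look like, assuming the same monkey-patching approach as the OpenAI wrapper. The `on_record` callback and the record fields below are hypothetical placeholders; the real implementation should hand records to the observers datastore the way `openai.py` does. Streaming responses are passed through unrecorded in this sketch.

```python
import functools
import time
from typing import Any, Callable

from llama_cpp import Llama


def wrap_llama_cpp(llm: Llama, on_record: Callable[[dict], None] = print) -> Llama:
    """Patch a Llama instance so every completion call is recorded.

    `on_record` is a hypothetical stand-in for handing the record to the
    observers datastore, as the OpenAI wrapper does.
    """
    original_create_completion = llm.create_completion

    @functools.wraps(original_create_completion)
    def traced_create_completion(prompt, *args: Any, **kwargs: Any):
        if kwargs.get("stream", False):
            # Streaming returns a chunk iterator; recording it is out of
            # scope for this sketch, so pass it through untouched.
            return original_create_completion(prompt, *args, **kwargs)

        started = time.time()
        response = original_create_completion(prompt, *args, **kwargs)
        on_record(
            {
                "prompt": prompt,
                "response": response["choices"][0]["text"],
                "model": response.get("model"),
                "usage": response.get("usage"),
                "latency_s": round(time.time() - started, 3),
            }
        )
        return response

    # Llama.__call__ delegates to create_completion, so patching the
    # instance attribute covers both `llm(...)` and
    # `llm.create_completion(...)`.
    llm.create_completion = traced_create_completion
    return llm
```

Patching the instance rather than subclassing keeps the call site identical to plain `llama-cpp-python`, which matches the usage shown above.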
nchen2211 commented 3 hours ago

Hello,

I see this ticket hasn't been assigned to anyone. I'd like to pick it up and contribute. Thanks!