parakeet-nest / parakeet

🦜🪺 Parakeet is a GoLang library, made to simplify the development of small generative AI applications with Ollama 🦙.
https://parakeet-nest.github.io/parakeet/
MIT License

Implement LLM spans as defined in opentelemetry semantic conventions 1.27.0 #2

Open codefromthecrypt opened 3 weeks ago

codefromthecrypt commented 3 weeks ago

This is a tall ask, but OpenTelemetry defines spans and metrics for LLMs. I'm intentionally referring to version 1.27.0, as it is what most vendors implement, and `main` may be unstable:

- https://github.com/open-telemetry/semantic-conventions/blob/v1.27.0/docs/gen-ai/gen-ai-spans.md
- https://github.com/open-telemetry/semantic-conventions/blob/v1.27.0/docs/gen-ai/gen-ai-metrics.md

Instrumentation would be great for showing where time is spent. There will be ambiguities in some cases, so starting with the basic completion as a span is probably best. Metrics and more ambiguous things like tool calls can come later, and I would have suggestions on those.
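To make the shape of such a span concrete, here is a stdlib-only Go sketch. The `gen_ai.*` attribute names and the span-name format (`{gen_ai.operation.name} {gen_ai.request.model}`) come from the 1.27.0 conventions; the helper functions themselves are illustrative, not a parakeet or OpenTelemetry API:

```go
package main

import "fmt"

// genAIChatAttrs returns the core gen_ai.* attributes the 1.27.0
// semantic conventions define for a chat span. Helper is a sketch,
// not part of any library.
func genAIChatAttrs(system, model string, inTokens, outTokens int) map[string]any {
	return map[string]any{
		"gen_ai.operation.name":      "chat",
		"gen_ai.system":              system, // e.g. "openai" when using Ollama's OpenAI endpoint
		"gen_ai.request.model":       model,
		"gen_ai.usage.input_tokens":  inTokens,
		"gen_ai.usage.output_tokens": outTokens,
	}
}

// spanName builds the name the conventions require:
// "{gen_ai.operation.name} {gen_ai.request.model}".
func spanName(attrs map[string]any) string {
	return fmt.Sprintf("%s %s", attrs["gen_ai.operation.name"], attrs["gen_ai.request.model"])
}

func main() {
	attrs := genAIChatAttrs("openai", "codegemma:2b-code", 12, 34)
	fmt.Println(spanName(attrs)) // prints "chat codegemma:2b-code"
}
```

In a real implementation these would be set as attributes on a span started from an OpenTelemetry tracer around the completion call, but the names above are the part the spec pins down.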

I have been doing examples with Ollama and otel-tui (to keep both sides simple and mostly Go).

Here's an example of agiflow's span. Ignore any attribute that doesn't start with `gen_ai.`:

[Screenshot, 2024-08-19: agiflow-instrumented span attributes shown in a trace viewer]

```python
import os
from agiflow import Agiflow
from openai import OpenAI
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Initialize the OTLP exporter and AGIFlow instrumentation
app_name = "agiflow-python-ollama"
otlp_endpoint = os.getenv("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT", "http://localhost:4318/v1/traces")
otlp_exporter = OTLPSpanExporter(endpoint=otlp_endpoint)
Agiflow.init(app_name=app_name, exporter=otlp_exporter)

def main():
    ollama_host = os.getenv('OLLAMA_HOST', 'localhost')
    # Use Ollama's OpenAI-compatible endpoint, not the native Ollama API.
    base_url = 'http://' + ollama_host + ':11434/v1'
    client = OpenAI(base_url=base_url, api_key='unused')
    messages = [
      {
        'role': 'user',
        'content': '<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>',
      },
    ]
    chat_completion = client.chat.completions.create(model='codegemma:2b-code', messages=messages)
    print(chat_completion.choices[0].message.content)

if __name__ == "__main__":
    main()
```
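The prompt in that example uses codegemma's fill-in-the-middle control tokens. A tiny Go helper sketching how such a prompt could be assembled (the function name is mine, not a parakeet API):

```go
package main

import "fmt"

// fimPrompt assembles a fill-in-the-middle prompt using the control
// tokens codegemma's code variants are trained on. Illustrative only.
func fimPrompt(prefix, suffix string) string {
	return fmt.Sprintf("<|fim_prefix|>%s<|fim_suffix|>%s<|fim_middle|>", prefix, suffix)
}

func main() {
	// Same prompt as the Python example above.
	fmt.Println(fimPrompt("def hello_world():", ""))
	// prints "<|fim_prefix|>def hello_world():<|fim_suffix|><|fim_middle|>"
}
```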

I mentioned this one because its span name, attributes, and span events are all completely compatible with the specs. langtrace and openlit are also both nearly 100% compatible. openllmetry is pretty close, except for how prompt/completion are modeled. I believe that if you instrument parakeet, it will be the first Go library to use the specs, and that by itself is exciting.

codefromthecrypt commented 3 weeks ago

Here's a recent example PR for langchaingo: https://github.com/tmc/langchaingo/pull/944