openlit / openlit

Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Management, Vault, Playground. 🚀💻 Integrates with 30+ LLM Providers, VectorDBs, Frameworks and GPUs.
https://docs.openlit.io
Apache License 2.0

[Feat]: Add Option to Disable Logging of Image in Instrumentors #502

Open chriskhanhtran-verisk opened 5 days ago

chriskhanhtran-verisk commented 5 days ago

🚀 What's the Problem?

Currently, the instrumentors log base64-encoded image URLs to OpenTelemetry (OTel).

Example messages:

messages = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe the image for me"},
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}},
    ]
}

Code:

content_str = ", ".join(
    # pylint: disable=line-too-long
    f'{item["type"]}: {item["text"] if "text" in item else item["image_url"]}'
    if "type" in item else f'text: {item["text"]}'
    for item in content
)

This creates two issues:

  1. If the image is large, the OTel exporter can fail with a 413 (Payload Too Large) error.
  2. Logging image bytes or base64 data doesn’t provide meaningful insights and can add unnecessary load.
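To give a rough sense of the sizes involved, here is a small sketch. The 3 MB image is an assumed figure for illustration only, and the request size that actually triggers a 413 depends on how the collector or backend is configured.

```python
import base64

# Hypothetical 3 MB image payload (placeholder bytes, not a real image).
image_bytes = bytes(3 * 1024 * 1024)

# Base64 encodes every 3 input bytes as 4 output characters,
# so the encoded string is about 4/3 the size of the raw image.
base64_image = base64.b64encode(image_bytes).decode("ascii")
data_url = f"data:image/jpeg;base64,{base64_image}"

print(len(image_bytes))   # raw size in bytes
print(len(base64_image))  # encoded size in characters
```

The whole `data_url` string ends up in the span attribute, so a single image can dominate the size of the export request.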

💡 Your Dream Solution

Introduce a configurable flag to disable logging of images, or modify the behavior to log only text messages by default. This would prevent large binary data from being sent to OTel and streamline the logging process.

Proposed Code Change:

Original:

if isinstance(content, list):
    content_str = ", ".join(
        f'{item["type"]}: {item["text"] if "text" in item else item["image_url"]}'
        if "type" in item else f'text: {item["text"]}'
        for item in content
    )
    formatted_messages.append(f"{role}: {content_str}")
else:
    formatted_messages.append(f"{role}: {content}")

Proposed Solution:

if isinstance(content, list):
    content_str = ", ".join(
        f'text: {item["text"]}' for item in content if "text" in item
    )
    formatted_messages.append(f"{role}: {content_str}")
else:
    formatted_messages.append(f"{role}: {content}")

This update would ensure that only text content is logged, preventing large, unnecessary data from being exported.
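For the configurable-flag variant, a minimal sketch could look like the following. The function name `format_content` and the `log_images` parameter are hypothetical and are not existing OpenLIT options; they only illustrate how the text-only behavior could be the default while still allowing image entries to be logged on request.

```python
def format_content(role, content, log_images=False):
    """Format one chat message for logging.

    `log_images` is a hypothetical flag, not an existing OpenLIT option.
    """
    # Plain string content is logged unchanged.
    if not isinstance(content, list):
        return f"{role}: {content}"

    parts = []
    for item in content:
        if "text" in item:
            parts.append(f'text: {item["text"]}')
        elif log_images and "image_url" in item:
            # Only reached when the caller explicitly opts in.
            parts.append(f'{item.get("type", "image_url")}: {item["image_url"]}')
        # Any other item is skipped.
    return f"{role}: {', '.join(parts)}"


# Usage with a placeholder image string instead of real base64 data.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe the image for me"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,AAAA"}},
    ],
}
print(format_content(message["role"], message["content"]))
# user: text: Describe the image for me
```

Defaulting the flag to `False` keeps large payloads out of the exported spans unless someone deliberately turns image logging back on.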

patcher9 commented 5 days ago

Hey @chriskhanhtran-verisk, yeah, it makes sense to add this. Since you already know the change, do you want to raise the PR as well? (If not, I can surely do it too :) )