Arize-ai / phoenix

AI Observability & Evaluation
https://docs.arize.com/phoenix

[BUG] Traces Not Appearing in Phoenix Dashboard When Using FastAPI and LiteLLM with arize-phoenix-otel #5364

Open rohanbalkondekar opened 2 days ago

rohanbalkondekar commented 2 days ago

I'm using Phoenix to trace LLM calls in a FastAPI application that utilizes LiteLLM. When running the application, LLM calls work correctly, and responses are returned as expected. However, traces are not appearing in the Phoenix dashboard when using the hosted Phoenix instance via arize-phoenix-otel.

When I run similar code in a Jupyter notebook or a synchronous script against the local Phoenix instance (arize-phoenix), tracing works correctly and the traces show up in the dashboard. This leads me to believe the issue is related either to context propagation in the asynchronous FastAPI application or to the hosted Phoenix tracing setup.
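
For reference, this is roughly the synchronous, local-Phoenix variant that does produce traces for me (a minimal sketch; the model name and prompt are just placeholders):

# Synchronous script against a local Phoenix instance -- traces show up fine here.
import phoenix as px
from phoenix.otel import register
from openinference.instrumentation.litellm import LiteLLMInstrumentor
from litellm import completion

session = px.launch_app()  # start the local Phoenix UI/collector

# register() without an explicit endpoint points at the local collector by default
tracer_provider = register(project_name="my-llm-app")
LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)

response = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hey"}],
)
print(response.choices[0].message.content)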

To Reproduce

Steps to reproduce the behavior:

  1. Create a Minimal FastAPI Application

    • Directory Structure

      tracing_test/
      ├── main.py
      ├── utils.py
      ├── test.py
      └── requirements.txt
    • main.py

      from typing import Optional
      from pydantic import BaseModel, Field
      from fastapi import FastAPI, HTTPException
      from utils import call_llm
      import asyncio
      
      # Initialize FastAPI app
      app = FastAPI()
      
      class Message(BaseModel):
          role: str
          content: str

      class QuestionRequest(BaseModel):
          messages: list[Message]
          language: Optional[str] = Field(
              "english", description="Language"
          )

      @app.post("/chat/")
      async def chat_lm(request: QuestionRequest):
          try:
              response_message = await asyncio.to_thread(
                  call_llm, messages=[message.model_dump() for message in request.messages]
              )
              response = {"message": response_message}
              return response
          except Exception as e:
              raise HTTPException(status_code=500, detail=str(e)) from e

      if __name__ == "__main__":
          import uvicorn

          uvicorn.run(app, host="0.0.0.0", port=5050)
    • utils.py

      import os
      import json
      from typing import Any
      from dotenv import load_dotenv
      
      load_dotenv()
      
      # Enable logging
      import logging
      logging.basicConfig(level=logging.DEBUG)
      logging.getLogger('opentelemetry').setLevel(logging.DEBUG)
      
      # Configure Phoenix tracing
      from phoenix.otel import register
      
      PHOENIX_API_KEY = "your_phoenix_api_key"  # Replace with your actual API key
      os.environ["PHOENIX_CLIENT_HEADERS"] = f"api_key={PHOENIX_API_KEY}"
      
      # Configure the Phoenix tracer
      tracer_provider = register(
       project_name="my-llm-app",  # Default is 'default'
       endpoint="https://app.phoenix.arize.com/v1/traces",
       set_global_tracer_provider=True,
      )
      
      # Instrument LiteLLM
      from openinference.instrumentation.litellm import LiteLLMInstrumentor
      LiteLLMInstrumentor().instrument()
      
      # Import litellm after instrumentation
      from litellm import completion
      
      def call_llm(messages: list[dict[str, str]]) -> dict[str, Any]:
          try:
              # LLM model configuration
              kwargs = {
                  "model": "gpt-4o",
                  "messages": messages,
                  "temperature": 0.1,
              }
              response = completion(**kwargs)
              return response.choices[0].message
          except Exception as e:
              # Handle exceptions
              raise Exception(f"Error calling LLM: {str(e)}") from e
    • test.py

      import requests
      
      api_url = 'http://0.0.0.0:5050/chat/'
      
      payload = {
          "messages": [
              {
                  "role": "user",
                  "content": "Hey"
              }
          ]
      }

      try:
          response = requests.post(api_url, json=payload)
          response.raise_for_status()
          response_json = response.json()
          llm_response = response_json['message']['content']
          print(llm_response)
      except requests.exceptions.RequestException as e:
          raise e
    • requirements.txt

      pydantic
      litellm
      uvicorn
      fastapi
      gunicorn 
      qdrant-client
      python-dotenv
      
      arize-phoenix
      arize-phoenix-otel
      openinference-instrumentation-litellm
  2. Install Dependencies

    pip install -r requirements.txt
  3. Run the FastAPI Application

    python main.py
  4. Send a Test Request

    In another terminal, run:

    python test.py

    This should print the LLM response (e.g., "Hello! How can I assist you today?") to the console, indicating that the LLM call was successful.

  5. Check the Phoenix Dashboard

Expected Behavior

Traces for the LLM calls made through the FastAPI endpoint appear in the Phoenix dashboard under the my-llm-app project.

Actual Behavior

The LLM calls succeed and responses are returned to the client, but no traces appear in the Phoenix dashboard.

Environment

The old approach (now considered legacy) worked just fine with arize-phoenix~=4.35.0:

import phoenix as px
from phoenix.trace.openai import OpenAIInstrumentor

# Initialize OpenAI auto-instrumentation
OpenAIInstrumentor().instrument()
session = px.launch_app()

Additional Context

Questions

Logs

Conclusion

I'm looking for guidance on how to resolve this issue. Any assistance or suggestions would be greatly appreciated. If there are any examples or documentation on integrating arize-phoenix-otel with FastAPI and LiteLLM, that would be very helpful.

Thank You

Thank you for your time and support.

Additional context: a screenshot was attached (image not reproduced here).

rohanbalkondekar commented 2 days ago

The initial issue of traces not appearing in the Phoenix dashboard was caused by package version mismatches in the environment. Specifically, strict version constraints in the requirements.txt file led to incompatibilities between the tracing packages and other dependencies.

Removing the strict version constraints and letting pip resolve compatible versions fixed the mismatches and ensured that arize-phoenix-otel, openinference-instrumentation-litellm, and the other dependencies worked together smoothly.

   - pandas~=2.2.0 
   - openai~=1.54.0
   - pydantic~=2.7.0
   - litellm~=1.52.0
   - uvicorn~=0.30.0 
   - fastapi~=0.111.0
   - gunicorn~=22.0.0 
   - qdrant-client~=1.9.0

   + pydantic
   + litellm
   + uvicorn
   + fastapi
   + gunicorn 
   + qdrant-client
   + python-dotenv
   + arize-phoenix
   + arize-phoenix-otel
   + openinference-instrumentation-litellm

After updating the packages, a second problem surfaced: a serialization error caused by Pydantic Message objects (which are not JSON serializable by default) appearing in the messages list processed by the tracing instrumentation.

When tracing is active, the openinference.instrumentation.litellm LiteLLMInstrumentor wraps litellm's completion function and tries to serialize the messages list for the trace. Because the Message objects in that list are not JSON serializable, serialization fails and the traces are never exported to the Phoenix dashboard.

Fix: convert the Message objects to dictionaries with Pydantic's .model_dump() method (Message inherits from BaseModel, which provides it). This keeps every message JSON serializable and compatible with the tracing mechanism.

In general, when tracing instrumentation processes data structures like messages, make sure every object they contain is JSON serializable. For custom Pydantic models such as Message, call .model_dump() to convert them into plain dictionaries, as in the sketch below.
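
As a rough illustration of the fix (the helper name call_llm_safe is just for this example):

from pydantic import BaseModel
from litellm import completion

class Message(BaseModel):
    role: str
    content: str

def call_llm_safe(messages: list[Message]):
    # Convert Pydantic Message objects to plain dicts so that litellm and the
    # OpenInference instrumentation only ever see JSON-serializable data.
    plain_messages = [m.model_dump() for m in messages]
    return completion(model="gpt-4o", messages=plain_messages)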

rohanbalkondekar commented 2 days ago

Reopening the issue because the package feels unstable. If there is any version mismatch, tracing simply stops working with no errors whatsoever. I have to delete the virtual environment and recreate it with all the latest packages before tracing starts working again.

Improved error logs would be helpful.
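
In the meantime, something like the following could at least confirm locally whether spans are being generated and exported (a sketch using the standard OpenTelemetry console exporter; not Phoenix-specific):

from phoenix.otel import register
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

tracer_provider = register(project_name="my-llm-app")

# Mirror every span to stdout, so a silent failure to export to the hosted
# endpoint is still visible in the application logs.
tracer_provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))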