antoni0z commented 3 weeks ago

Bug Report

Describe the bug

When attempting to retrieve traces or spans from a Phoenix server using the Python client, a 500 Internal Server Error is encountered. This occurs when executing client.get_trace_dataset() or client.get_spans_dataframe() in a notebook environment.

To Reproduce

Steps to reproduce the behavior:

Set up a Phoenix server using Docker with version 4.31.0. Use the following Docker Compose configuration:

services:
  phoenix:
    image: arizephoenix/phoenix:version-4.31.0
    ports:
      - "6006:6006"  # UI and HTTP API
      - "4317:4317"  # OTLP gRPC collector
    environment:
      - PHOENIX_WORKING_DIR=/mnt/data
    volumes:
      - phoenix_data:/mnt/data
    restart: always
    pull_policy: always

volumes:
  phoenix_data:

Start the Docker container using docker-compose up.

Install the Phoenix client with the following specification:

arize-phoenix==4.31.0; python_version >= "3.11" and python_version < "3.12"

In a Python notebook, execute the following code:

import phoenix as px

client = px.Client(endpoint="http://localhost:6006")

# Either call triggers the error:
client.get_trace_dataset()
client.get_spans_dataframe()

Observe the 500 Internal Server Error.

Expected behavior

Both client.get_trace_dataset() and client.get_spans_dataframe() should return the trace data from the Phoenix server without an internal server error.

Environment

OS: Ubuntu 22.04
Notebook Runtime: Jupyter
Phoenix Version: 4.31.0 (server and client)
Python Version: 3.11.9
Docker configuration:
- Image: arizephoenix/phoenix:version-4.31.0
- Ports: 6006:6006, 4317:4317
- Environment variables: PHOENIX_WORKING_DIR=/mnt/data
- Volumes: phoenix_data:/mnt/data

Additional context

Server logs show the following error:

INFO:     172.20.0.1:49718 - "POST /v1/spans?project_name=Cnt%20IA&project-name=Cnt%20IA HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
pyarrow.lib.ArrowInvalid: ("Could not convert '21' with type str: tried to convert to int64", 'Conversion failed for column attributes.metadata with type object')

This suggests a data type conversion issue when the server converts the spans to Arrow: the 'attributes.metadata' column appears to hold values of mixed types (the string '21' where PyArrow had inferred int64).
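To check whether metadata really is typed inconsistently, a small standard-library helper like the one below can list the keys whose values differ in type across spans. This is only a sketch: find_mixed_type_keys, the sample records, and the "page" key are hypothetical and are not part of Phoenix or of the data in this report.

```python
from collections import defaultdict
from typing import Any, Iterable, Mapping


def find_mixed_type_keys(
    records: Iterable[Mapping[str, Any]],
) -> dict[str, set[str]]:
    """Return the metadata keys whose values have more than one Python type."""
    seen: dict[str, set[str]] = defaultdict(set)
    for record in records:
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    # Keep only the keys that were observed with at least two distinct types.
    return {key: types for key, types in seen.items() if len(types) > 1}


# Hypothetical metadata from two spans: "page" is an int in one, a str in the other.
sample_metadata = [
    {"page": 21, "source": "a.pdf"},
    {"page": "21", "source": "b.pdf"},
]
mixed = find_mixed_type_keys(sample_metadata)
print(mixed)
```

Any key reported here is a candidate for the column that PyArrow fails to convert.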

The error occurs in the following file: /phoenix/env/phoenix/server/api/routers/v1/spans.py, line 120
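The client also emits a warning that begins: "The Phoenix server has an unknown version and may have" (the rest of the message was cut off in the captured output; the trailing "warnings.warn(" is the source line that Python prints along with a warning).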

Trace logging configuration in the instrumented application:

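# Imports this excerpt relies on. The OTLPSpanExporter path is an assumption
# (HTTP exporter); a gRPC exporter would be imported from
# opentelemetry.exporter.otlp.proto.grpc.trace_exporter instead.
import os

from openinference.instrumentation.langchain import LangChainInstrumentor
from openinference.semconv.resource import ResourceAttributes
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# Excerpt from a method; self._project_name holds the Phoenix project name.
try:
    resource = Resource(attributes={
        ResourceAttributes.PROJECT_NAME: self._project_name
    })
    tracer_provider = trace_sdk.TracerProvider(resource=resource)
    phoenix_collector_endpoint = os.getenv("PHOENIX_COLLECTOR_ENDPOINT")
    if not phoenix_collector_endpoint:
        raise ValueError("PHOENIX_COLLECTOR_ENDPOINT environment variable is not set.")
    span_exporter = OTLPSpanExporter(endpoint=phoenix_collector_endpoint)
    span_processor = SimpleSpanProcessor(span_exporter=span_exporter)
    tracer_provider.add_span_processor(span_processor=span_processor)
    trace_api.set_tracer_provider(tracer_provider=tracer_provider)
    LangChainInstrumentor().instrument()
    print("Instrument successful")
except Exception as e:
    print(f"An error has occurred while instrumenting: {e}")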

This configuration sets up the trace logging using OpenTelemetry and LangChain instrumentation.

The server logs also contain a FutureWarning about pandas Series.__getitem__ behavior:

/phoenix/env/phoenix/trace/dsl/query.py:746: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`

This warning indicates that the server code indexes a pandas Series by integer position using [], which is deprecated in favor of .iloc and could break with a future pandas version. It does not appear to be related to the 500 error.

axiomofjoy commented 3 weeks ago

Thanks @antoni0z. This issue is caused by inconsistent types on metadata values, which PyArrow is not able to handle.
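One possible way to avoid the inconsistency at the source, sketched here as an assumption rather than a Phoenix feature, is to coerce every metadata value to a string wherever the application builds its metadata dict, so each key has a single type across spans. The helper name stringify_metadata and the sample values are hypothetical.

```python
from typing import Any, Mapping


def stringify_metadata(metadata: Mapping[str, Any]) -> dict[str, str]:
    """Convert every metadata value to str so each key has one type across spans."""
    return {key: str(value) for key, value in metadata.items()}


# Hypothetical metadata from two spans with conflicting types for "page".
raw_metadata = [{"page": 21}, {"page": "21"}]
normalized = [stringify_metadata(m) for m in raw_metadata]
print(normalized)
```

The trade-off is that numeric values lose their type (and None becomes the string "None"), so this only makes sense when metadata is used for display or filtering by exact string.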

antoni0z commented 3 weeks ago

As a quick fix for anyone who hits the same error: I was able to avoid it by narrowing the query to specific spans and columns:

from phoenix.trace.dsl import SpanQuery

filtered_query = SpanQuery("span_kind == 'LLM'").select(input="input.value", output="output.value", metadata="metadata")

spans = client.query_spans(filtered_query)

Excluding the problematic spans from the conversion to PyArrow this way avoids the error.

Arize-ai / phoenix

[BUG] inconsistently typed metadata prevents client methods from working #4420