Arize-ai / phoenix

AI Observability & Evaluation
https://docs.arize.com/phoenix

[ENHANCEMENT] log traces via client with a list of spans #3531

Open Verizane opened 2 weeks ago

Verizane commented 2 weeks ago

Is your feature request related to a problem? Please describe.

I had to customize the client method "log_traces" to work with the Spans I already have, so I wrote a custom method that I would like to contribute to the phoenix code, since it might help others as well. It takes a list of Spans and sends it to phoenix the same way log_traces does, without first having to create a TraceDataset.

Describe the solution you'd like

Here is the method:

import gzip
import phoenix as px

from urllib.parse import urljoin
from opentelemetry.proto.collector.trace.v1.trace_service_pb2 import ExportTraceServiceRequest
from opentelemetry.proto.common.v1.common_pb2 import AnyValue, KeyValue
from opentelemetry.proto.resource.v1.resource_pb2 import Resource
from opentelemetry.proto.trace.v1.trace_pb2 import ResourceSpans, ScopeSpans
from phoenix.trace.otel import encode_span_to_otlp
from phoenix.trace.schemas import Span

def log_traces_from_spans(client: px.Client, spans: list[Span], project_name: str) -> None:
    """
    Logs traces from a list of spans to the Phoenix server.

    Args:
        client (px.Client): The Phoenix client whose session and base URL are
            used to send the spans.
        spans (list[Span]): A list of spans with the traces to log to
            the Phoenix server.
        project_name (str): The project name under which to log the traces.

    Returns:
        None
    """
    # Wrap each span in its own OTLP export request, tagged with the project name.
    otlp_spans = [
        ExportTraceServiceRequest(
            resource_spans=[
                ResourceSpans(
                    resource=Resource(
                        attributes=[
                            KeyValue(
                                key="openinference.project.name",
                                value=AnyValue(string_value=project_name),
                            )
                        ]
                    ),
                    scope_spans=[ScopeSpans(spans=[encode_span_to_otlp(span)])],
                )
            ],
        )
        for span in spans
    ]
    # Serialize, gzip, and POST each request to the traces endpoint.
    for otlp_span in otlp_spans:
        serialized = otlp_span.SerializeToString()
        data = gzip.compress(serialized)
        client._session.post(
            urljoin(client._base_url, "/v1/traces"),
            data=data,
            headers={
                "content-type": "application/x-protobuf",
                "content-encoding": "gzip",
            },
        ).raise_for_status()

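
The transport step of the method above (build the URL, gzip the serialized bytes, set the headers) can be checked with the standard library alone. The following is only a minimal sketch, not part of Phoenix: `build_traces_request` is a hypothetical helper name, the base URLs are made-up examples, and the payload is a placeholder for the serialized protobuf bytes.

```python
import gzip
from urllib.parse import urljoin


def build_traces_request(base_url: str, payload: bytes) -> tuple[str, bytes, dict[str, str]]:
    """Hypothetical helper: build the URL, gzip body and headers for a POST to /v1/traces."""
    # A leading slash makes urljoin replace the whole path of base_url.
    url = urljoin(base_url, "/v1/traces")
    body = gzip.compress(payload)
    headers = {
        "content-type": "application/x-protobuf",
        "content-encoding": "gzip",
    }
    return url, body, headers


# Placeholder bytes standing in for otlp_span.SerializeToString().
payload = b"serialized-protobuf-bytes"
url, body, headers = build_traces_request("http://localhost:6006", payload)

# Assumed example of a server reachable under a path prefix.
prefixed_url, _, _ = build_traces_request("http://example.com/phoenix/", b"")
```

One thing the sketch makes visible: because the path starts with a slash, `urljoin` drops any path prefix on the base URL, so the second call yields a URL without `/phoenix/`. Whether that matters depends on how the server is deployed, which I have not verified.
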
mikeldking commented 1 week ago

@Verizane seems like a nice convenience. Can I ask a bit about how you are generating the list of spans?

Just some background on the TraceDataset: we designed it that way so that it also encapsulates the evals of the traces and fits the dataframe-centric approach we take in our Python code (our evals currently run on dataframes). I totally understand the request above, however; I just wanted to give you some context.