open-telemetry / opentelemetry-python-contrib

OpenTelemetry instrumentation for Python modules
https://opentelemetry.io
Apache License 2.0

Add OpenAI example #3006

Closed by codefromthecrypt 2 days ago

codefromthecrypt commented 1 week ago

Description

This adds an example project for OpenAI, which helps prevent copy/paste errors in an upcoming blog post.

See https://github.com/open-telemetry/opentelemetry.io/pull/5575

How Has This Been Tested?

aspire

First, I ran aspire, exposing its UI and HTTP collector ports.

$ docker run --rm -it -p 18888:18888 -p 4318:18890 --name aspire-dashboard mcr.microsoft.com/dotnet/aspire-dashboard:9.0
--snip--
      Login to the dashboard at http://localhost:18888/login?t=e11b234df56c2ccd3bd81961a6053c6c. The URL may need changes depending on how network access to the container is configured.
--snip--

Then, I ran the example and navigated to the login URL from the logs: http://localhost:18888/login?t=e11b234df56c2ccd3bd81961a6053c6c

[Two screenshots of the aspire dashboard]

jaeger

First, I ran jaeger, exposing its UI and HTTP collector ports (on all interfaces).

$ docker run --rm -it -p 16686:16686 -p 4318:4318 --name jaeger jaegertracing/jaeger:2.0.0 --set receivers.otlp.protocols.http.endpoint=0.0.0.0:4318

Next, because jaeger doesn't support log events, I set the following before running the example to avoid errors exporting them:

OTEL_LOGS_EXPORTER=console
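
As a rough sketch of the step above, the override could be applied from a small Python launcher instead of the shell. The helper name `example_env` and its `logs_supported` flag are hypothetical; only the `OTEL_LOGS_EXPORTER=console` setting comes from this PR.

```python
import os


def example_env(logs_supported, base=None):
    """Return a copy of the environment for running the example.

    `logs_supported` is a hypothetical flag: pass False for backends such as
    jaeger that cannot receive log events, so they go to the console instead.
    """
    env = dict(os.environ if base is None else base)
    if not logs_supported:
        env["OTEL_LOGS_EXPORTER"] = "console"
    return env


# Possible usage (not executed here; requires the example's dependencies):
#   import subprocess
#   subprocess.run(["opentelemetry-instrument", "python", "main.py"],
#                  env=example_env(logs_supported=False))
print(example_env(logs_supported=False, base={})["OTEL_LOGS_EXPORTER"])
```

Leaving the variable untouched when logs are supported keeps whatever exporter the example already configures.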

Then, I navigated to http://localhost:16686/ and searched for a trace with the service name "opentelemetry-python-openai".

[Screenshot of the trace in the jaeger UI]

Finally, I checked the console output of the example itself, which contained the log events:

{
    "body": "Overriding of current EventLoggerProvider is not allowed",
    "severity_number": "<SeverityNumber.WARN: 13>",
    "severity_text": "WARN",
    "attributes": {
        "code.filepath": "/Users/adriancole/oss/opentelemetry-python-contrib/instrumentation-genai/opentelemetry-instrumentation-openai-v2/example/.venv/lib/python3.12/site-packages/opentelemetry/_events/__init__.py",
        "code.function": "_set_event_logger_provider",
        "code.lineno": 196
    },
    "dropped_attributes": 0,
    "timestamp": "2024-11-19T00:33:18.386528Z",
    "observed_timestamp": "2024-11-19T00:33:18.386550Z",
    "trace_id": "0x00000000000000000000000000000000",
    "span_id": "0x0000000000000000",
    "trace_flags": 0,
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.28.2",
            "service.name": "opentelemetry-python-openai",
            "telemetry.auto.version": "0.49b2"
        },
        "schema_url": ""
    }
}
{
    "body": {
        "content": "Write a short poem on OpenTelemetry."
    },
    "severity_number": "<SeverityNumber.INFO: 9>",
    "severity_text": null,
    "attributes": {
        "gen_ai.system": "openai",
        "event.name": "gen_ai.user.message"
    },
    "dropped_attributes": 0,
    "timestamp": "2024-11-19T00:33:18.606173Z",
    "observed_timestamp": "2024-11-19T00:33:18.606180Z",
    "trace_id": "0xf2077fa38294b784aa119fcb138df4ed",
    "span_id": "0xcd777b27ee934846",
    "trace_flags": 1,
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.28.2",
            "service.name": "opentelemetry-python-openai",
            "telemetry.auto.version": "0.49b2"
        },
        "schema_url": ""
    }
}
{
    "body": {
        "index": 0,
        "finish_reason": "stop",
        "message": {
            "role": "assistant",
            "content": "In realms of code where shadows play,  \nOpenTelemetry lights the way.  \nWith traces flowing, metrics in tow,  \nIt captures the dance of data's flow.  \n\nAcross the clouds, in services vast,  \nIt weaves insights from the digital cast.  \nA tapestry rich, with context and care,  \nMonitoring systems, everywhere.  \n\nFrom coalitions of logs, a story unfolds,  \nPerformance and health in patterns retold.  \nIn this open embrace, we find our grace,  \nA path to observability, our guiding space.  \n\nSo let us embrace this tool so bright,  \nAs we chase down the bugs and errors in flight.  \nIn the universe of software, let\u2019s take our stand,  \nWith OpenTelemetry, we\u2019ll understand!"
        }
    },
    "severity_number": "<SeverityNumber.INFO: 9>",
    "severity_text": null,
    "attributes": {
        "gen_ai.system": "openai",
        "event.name": "gen_ai.choice"
    },
    "dropped_attributes": 0,
    "timestamp": "2024-11-19T00:33:21.444441Z",
    "observed_timestamp": "2024-11-19T00:33:21.444461Z",
    "trace_id": "0xf2077fa38294b784aa119fcb138df4ed",
    "span_id": "0xcd777b27ee934846",
    "trace_flags": 1,
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.28.2",
            "service.name": "opentelemetry-python-openai",
            "telemetry.auto.version": "0.49b2"
        },
        "schema_url": ""
    }
}
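
To check console output like the above without reading it by eye, something along these lines might help. It is only a sketch: it assumes the output is a run of back-to-back JSON objects with no other text mixed in, and the function names are made up for illustration. The sample records are abbreviated copies of the ones shown above.

```python
import json

# trace_id of records emitted outside any span (32 zero hex digits)
NO_TRACE_ID = "0x" + "0" * 32


def iter_json_objects(text):
    """Yield each top-level JSON object from back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # raw_decode does not skip leading whitespace itself
            continue
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def genai_events_by_trace(text):
    """Group gen_ai event names by trace_id, skipping records with no trace."""
    grouped = {}
    for record in iter_json_objects(text):
        name = record.get("attributes", {}).get("event.name")
        if name is None or record.get("trace_id") == NO_TRACE_ID:
            continue
        grouped.setdefault(record["trace_id"], []).append(name)
    return grouped


SAMPLE = """
{"body": "Overriding of current EventLoggerProvider is not allowed",
 "attributes": {"code.function": "_set_event_logger_provider"},
 "trace_id": "0x00000000000000000000000000000000"}
{"body": {"content": "Write a short poem on OpenTelemetry."},
 "attributes": {"gen_ai.system": "openai", "event.name": "gen_ai.user.message"},
 "trace_id": "0xf2077fa38294b784aa119fcb138df4ed"}
{"body": {"index": 0, "finish_reason": "stop"},
 "attributes": {"gen_ai.system": "openai", "event.name": "gen_ai.choice"},
 "trace_id": "0xf2077fa38294b784aa119fcb138df4ed"}
"""

print(genai_events_by_trace(SAMPLE))
```

Both gen_ai events share the trace ID of the chat span, while the SDK warning carries the all-zero trace ID and is filtered out.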

otel-tui

First, I ran otel-tui, exposing its HTTP collector port. As this is a terminal UI, there's no UI port to expose.

$ docker run --rm -it -p 4318:4318 --name otel-tui ymtdzzz/otel-tui:latest

Then, I ran the example and looked at the otel-tui console:

[Screenshot of the otel-tui console]

codefromthecrypt commented 1 week ago

@lzchen @xrmx @drewby @svrnm I noticed there are no examples in this repo yet, so the style here probably isn't the best. Also, I'm currently using requirements.txt and am not sure whether we want to change this to bootstrap.

Finally, I didn't include how to run a specific collector, partly because jaeger doesn't support log events, and most current users of genai observability will want to see them.

This didn't take me long to do, so if we want to drop this PR or move it somewhere else, no big deal. I just wanted to highlight that things work at the moment and will be cleaner once 1.28.2 is out.

codefromthecrypt commented 1 week ago

Updated to gRPC so it can work with aspire, per the blog.

Run aspire

$ docker run --rm -it -d \
>     -p 18888:18888 \
>     -p 4317:18889 \
>     --name aspire-dashboard \
>     mcr.microsoft.com/dotnet/aspire-dashboard:9.0
Unable to find image 'mcr.microsoft.com/dotnet/aspire-dashboard:9.0' locally
9.0: Pulling from dotnet/aspire-dashboard
c326c12854c9: Pull complete 
9fe7cf37e47f: Pull complete 
8fe9b832a0d5: Pull complete 
e346529f9f70: Pull complete 
ff84cb0139e9: Pull complete 
0f8505221124: Pull complete 
0f761c810d93: Pull complete 
1388b36a0103: Pull complete 
d2cac5ae0450: Pull complete 
Digest: sha256:4b762cb15ebc4237464514c40e07905eecb6887e5e2cc6c6c04e8676563ea298
Status: Downloaded newer image for mcr.microsoft.com/dotnet/aspire-dashboard:9.0
c6ac45a64af72963b0c36c3634bc08a7e35206d9964b79aedb5b6c22dfeffde7

Get the URL with the API key from the logs to view it

$ docker logs aspire-dashboard
info: Aspire.Dashboard.DashboardWebApplication[0]
      Aspire version: 9.0.0+01ed51919f8df692ececce51048a140615dc759d
warn: Microsoft.AspNetCore.DataProtection.Repositories.FileSystemXmlRepository[60]
      Storing keys in a directory '/home/app/.aspnet/DataProtection-Keys' that may not be persisted outside of the container. Protected data will be unavailable when container is destroyed. For more information go to https://aka.ms/aspnet/dataprotectionwarning
warn: Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager[35]
      No XML encryptor configured. Key {70382104-040f-49f6-b90e-988e4aadc1b1} may be persisted to storage in unencrypted form.
info: Aspire.Dashboard.DashboardWebApplication[0]
      Now listening on: http://[::]:18888
info: Aspire.Dashboard.DashboardWebApplication[0]
      Login to the dashboard at http://localhost:18888/login?t=8d5484ea587968a000b293be38e5e572. The URL may need changes depending on how network access to the container is configured.
info: Aspire.Dashboard.DashboardWebApplication[0]
      OTLP/gRPC listening on: http://[::]:18889
info: Aspire.Dashboard.DashboardWebApplication[0]
      OTLP/HTTP listening on: http://[::]:18890
warn: Aspire.Dashboard.DashboardWebApplication[0]
      OTLP server is unsecured. Untrusted apps can send telemetry to the dashboard. For more information, visit https://go.microsoft.com/fwlink/?linkid=2267030
info: Aspire.Dashboard.Authentication.FrontendCompositeAuthenticationHandler[12]
      AuthenticationScheme: FrontendComposite was challenged.

Click on the trace

[Screen recording of the trace in the aspire dashboard]

codefromthecrypt commented 4 days ago

P.S. I found one glitch in aspire, but only if you set OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=false. Maybe someone over there can knock it out before the blog releases? https://github.com/dotnet/aspire/issues/6703

P.P.S. This isn't a problem in otel-tui, which shows N/A for logs without a body.
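
For anyone writing a viewer, the otel-tui behavior described here can be approximated roughly as below. This is not otel-tui's actual code; `display_body` is a hypothetical helper, and treating empty strings and empty objects as "no body" is an assumption.

```python
import json


def display_body(record):
    """Render a log record's body, falling back to "N/A" when it is absent.

    Assumption: a record emitted with message content capture disabled has a
    missing, null, or empty body.
    """
    body = record.get("body")
    if body is None or body == "" or body == {}:
        return "N/A"
    if isinstance(body, str):
        return body
    return json.dumps(body, sort_keys=True)


print(display_body({"attributes": {"event.name": "gen_ai.user.message"}}))
print(display_body({"body": {"content": "Write a short poem on OpenTelemetry."}}))
```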

codefromthecrypt commented 4 days ago

Note: I switched to OTLP/HTTP, which lets us use smaller images and is also pretty universally supported.

Earlier, I read an outdated link from aspire that made me think it only supported gRPC. I wanted this example to work with aspire, so I switched to gRPC for that reason alone. The outdated text read:

OTLP protocol, with the dashboard currently supporting only the OTLP/gRPC protocol. Configure applications to use the grpc protocol

In fact, it works fine with HTTP; you just need to map the ports like so (-p 4318:18890):

$ docker run --rm -it -p 18888:18888 -p 4318:18890 --name aspire-dashboard mcr.microsoft.com/dotnet/aspire-dashboard:9.0
info: Aspire.Dashboard.DashboardWebApplication[0]
      Aspire version: 9.0.0+01ed51919f8df692ececce51048a140615dc759d
warn: Microsoft.AspNetCore.DataProtection.Repositories.FileSystemXmlRepository[60]
      Storing keys in a directory '/home/app/.aspnet/DataProtection-Keys' that may not be persisted outside of the container. Protected data will be unavailable when container is destroyed. For more information go to https://aka.ms/aspnet/dataprotectionwarning
warn: Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager[35]
      No XML encryptor configured. Key {37c9abe2-ecf4-4f06-95e2-e3795120b7d0} may be persisted to storage in unencrypted form.
info: Aspire.Dashboard.DashboardWebApplication[0]
      Now listening on: http://[::]:18888
info: Aspire.Dashboard.DashboardWebApplication[0]
      Login to the dashboard at http://localhost:18888/login?t=fc56cfd0fc3acfecda87f06ea2107d3d. The URL may need changes depending on how network access to the container is configured.
info: Aspire.Dashboard.DashboardWebApplication[0]
      OTLP/gRPC listening on: http://[::]:18889
info: Aspire.Dashboard.DashboardWebApplication[0]
      OTLP/HTTP listening on: http://[::]:18890
warn: Aspire.Dashboard.DashboardWebApplication[0]
      OTLP server is unsecured. Untrusted apps can send telemetry to the dashboard. For more information, visit https://go.microsoft.com/fwlink/?linkid=2267030
info: Aspire.Dashboard.Authentication.FrontendCompositeAuthenticationHandler[7]
      FrontendComposite was not authenticated. Failure message: Unprotect ticket failed
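
The two port mappings used in this thread (4317:18889 for gRPC, 4318:18890 for HTTP) can be summarized in a small sketch. The table and the `aspire_run_args` helper are hypothetical conveniences; the ports, image, and container name are the ones from the commands and logs above.

```python
# Host port -> aspire container port, per OTLP protocol, as used in this thread.
ASPIRE_OTLP_PORTS = {
    "grpc": (4317, 18889),           # logs: "OTLP/gRPC listening on: http://[::]:18889"
    "http/protobuf": (4318, 18890),  # logs: "OTLP/HTTP listening on: http://[::]:18890"
}


def aspire_run_args(protocol="http/protobuf"):
    """Build the docker argv that exposes the UI and the chosen OTLP port."""
    host_port, container_port = ASPIRE_OTLP_PORTS[protocol]
    return [
        "docker", "run", "--rm", "-it",
        "-p", "18888:18888",
        "-p", f"{host_port}:{container_port}",
        "--name", "aspire-dashboard",
        "mcr.microsoft.com/dotnet/aspire-dashboard:9.0",
    ]


print(" ".join(aspire_run_args("http/protobuf")))
```

Mapping to the standard host ports (4317 or 4318) means the example can keep its default OTLP endpoint for whichever protocol it uses.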

My main motivation is that I have a bunch of examples of different LLM libraries, and they can all use base alpine+python images, except those that require gRPC. To use gRPC you need to install gcc and another package, which slows the build down and makes the resulting image larger.

For example, the example project here builds on the small image, and builds quickly because no platform packages are required.

# Use an alpine image to make the runtime smaller
FROM docker.io/python:3.12.7-alpine3.20
RUN python -m pip install --upgrade pip

COPY /requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt

COPY main.py /

CMD [ "opentelemetry-instrument", "python", "main.py" ]

codefromthecrypt commented 4 days ago

I added an option to run with docker. If that's too much, ask me to delete it!

codefromthecrypt commented 3 days ago

OK, ready for review. I tried it on three OTLP tools, including one that doesn't support logs (jaeger v2).