openai / openai-python

The official Python library for the OpenAI API
https://pypi.org/project/openai/
Apache License 2.0
23.06k stars 3.25k forks source link

Commit a7ebc26 disables Open Telemetry's httpx instrumentation in some scenarios #1144

Open palvarezcordoba opened 9 months ago

palvarezcordoba commented 9 months ago

Confirm this is an issue with the Python library and not an underlying OpenAI API

Describe the bug

This commit a7ebc26, which was introduced in PR #966, for release v1.3.9, disables httpx instrumentation in some cases.

Specifically, it is disabled when openai is imported before instrumenting httpx. This is because opentelemetry.instrumentation.httpx .HTTPXClientInstrumentor._instrument creates subclasses of httpx.Client and httpx.AsyncClient. And then replaces the original clients with those subclasses, which adds telemetry. In the above commit, openai creates SyncHttpxClientWrapper and AsyncHttpxClientWrapper subclasses of httpx's clients.

That means that when we instrument first, and then import openai, the client wrappers inherit from the instrumented clients. When we import openai first, and then instrument httpx, the wrapper clients inherit from the original httpx clients.

Maybe this should be addressed at the opentelemetry-python-contrib side, but even so, v1.3.9 broke telemetry.

Could the change implemented in #966 be implemented in a backward-compatible way? If not, at the very least, the changelog should be updated, warning about this.

To Reproduce

  1. Import openai
  2. Use HTTPXClientInstrumentor from opentelemetry.instrumentation.httpx to instrument httpx.

Code snippets

Running this, openai httpx requests are instrumented:

from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor, _InstrumentedClient
HTTPXClientInstrumentor().instrument()
from openai._base_client import SyncHttpxClientWrapper
assert issubclass(SyncHttpxClientWrapper, _InstrumentedClient)

But this version is not instrumented, and the assertion fails:

from openai._base_client import SyncHttpxClientWrapper
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor, _InstrumentedClient
HTTPXClientInstrumentor().instrument()
assert issubclass(SyncHttpxClientWrapper, _InstrumentedClient)

OS

any

Python version

any

Library version

openai from v1.3.9.0 to v1.12.0

rattrayalex commented 9 months ago

Thank you for reporting, @palvarezcordoba !

cc @RobertCraigie , can you take a look?

RobertCraigie commented 9 months ago

Thanks for the detailed report and for bisecting @palvarezcordoba! Unfortunately I don't think there's anything we can realistically do here and the fact that open telemetry worked independently of the import order was pure coincidence.

It looks like opentelemetry-python-contrib supports instrumenting individual client instances as well which should work with the openai SDK independently of the import order if instrumenting before importing isn't feasible for certain cases.

I think we'll want to add a section to the README.md showcasing instrumentation and mentioning these points cc @rattrayalex

palvarezcordoba commented 9 months ago

@RobertCraigie Yeah, I didn't expect the issue to be solved here, although it would have been nice. As I said, this isn't a bug, but v1.3.9 broke telemetry. Shouldn't this be stated in the changelog?

Writing an example in the README would be very good. I was going to instrument a client like in the link you shared, but I have some uncertainty about that.

Your default clients are in src/openai/_base_client.py. Since it starts with an underscore, I assume it is not considered part of the public API and should not be relied upon. So, I must not create instances of those. And I don't want to make my own clients, because if Openai makes more changes to their clients, I'm going to be using a different version.

For that reason, an example would be appreciated.

Thanks, @rattrayalex and @RobertCraigie for your fast response.

RobertCraigie commented 9 months ago

Ah of course, you shouldn't have to touch anything in _base_client, here's an example that should work (I haven't tested it):

import httpx
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
from openai import OpenAI

http_client = httpx.Client(transport=telemetry_transport)
HTTPXClientInstrumentor.instrument_client(http_client)

openai_client = OpenAI(http_client=http_client)

# use `openai_client` as normal

Shouldn't this be stated in the changelog?

Sure we can update the changelog, the only reason it wasn't included in the first place is that we were unaware it would break this case.

pamelafox commented 9 months ago

@RobertCraigie Where does telemetry_transport come from in that code? I'm testing this out in our samples, as I'm not seeing good traces for the OpenAI calls currently.

RobertCraigie commented 9 months ago

Ah sorry it comes from this example: https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation/opentelemetry-instrumentation-httpx#using-transports-directly

import httpx
from opentelemetry.instrumentation.httpx import (
    SyncOpenTelemetryTransport,
)

transport = httpx.HTTPTransport()
telemetry_transport = SyncOpenTelemetryTransport(transport)
tonybaloney commented 8 months ago

Ah sorry it comes from this example: https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation/opentelemetry-instrumentation-httpx#using-transports-directly

import httpx
from opentelemetry.instrumentation.httpx import (
    SyncOpenTelemetryTransport,
)

transport = httpx.HTTPTransport()
telemetry_transport = SyncOpenTelemetryTransport(transport)

Sadly it's a bit more involved.

HTTPClient (or AsyncHTTPClient) needs base_url set which is an argument for OpenAIClient but a factory for AzureOpenAIClient, you need limits set to DEFAULT_LIMITS, you need follow_redirects needs to be enabled.

This seems just as fragile as HTTPXClientInstrumentor.instrument_client(openai_client._client)

tonybaloney commented 8 months ago

I'm using the opentelemetry-instrumentation-openai for now, this works well and puts the patch at the API level instead of relying on underlying implementation details