🚀 Feature: instrument `instructor`

zilto commented 1 week ago

Which component is this feature for?

All Packages

🔖 Feature description

Add instrumentation for the instructor library, which produces structured outputs from LLMs.

🎤 Why is this feature needed ?

Instructor works by patching the client, whether it's OpenAI, Cohere, Anthropic, etc. Currently, it seems to be incompatible with the OpenLLMetry instrumentation for these clients (no telemetry is captured).

After browsing, I found an OpenInference package for it, but it seems out of date or incompatible with the OpenLLMetry instrumentation I'm using: the application crashes with library errors.

✌️ How do you aim to achieve this?

It would be nice to be able to instrument instructor with the OpenLLMetry toolset!

🔄️ Additional Information

No response

👀 Have you spent some time to check if this feature request has been raised before?

[X] I checked and didn't find similar issue

Are you willing to submit PR?

None

nirga commented 1 week ago

Thanks @zilto! Yes, we'll need to do some black magic here to support that. OpenInference, unfortunately, doesn't follow semantic conventions, so it doesn't look like it will ever be compatible with OpenTelemetry. Would love it if you can assist with that!

skrawcz commented 1 week ago

@nirga I think this is really an ordering issue. You first:

1. turn on OpenTelemetry,
2. import instructor.

If you do the reverse, then things break.

nirga commented 1 week ago

@skrawcz so you just need to initialize the SDK before importing instructor?

zilto commented 1 week ago

Took some debugging, but some notes:

Ordering matters. Telemetry is captured only if the underlying library (openai at least; others weren't thoroughly tested) is instrumented before the instructor client is created. Otherwise the code runs, but no telemetry is logged.

This works

import openai
import instructor
from opentelemetry.instrumentation.openai import OpenAIInstrumentor

# Instrument openai first...
OpenAIInstrumentor().instrument()

# ...then create the instructor client from an openai client
instructor_client = instructor.from_openai(openai.OpenAI())

instructor_client.create(...)

This runs, but no telemetry is produced

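import openai
import instructor
from opentelemetry.instrumentation.openai import OpenAIInstrumentor

# The instructor client is created before openai is instrumented...
instructor_client = instructor.from_openai(openai.OpenAI())

# ...so instrumenting here is too late for this client
OpenAIInstrumentor().instrument()

instructor_client.create(...)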
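
A toy sketch of why the order might matter. This is my guess at the mechanism, not verified against the instructor or OpenLLMetry source: if the patching library grabs a reference to `create` when the client is built, and the instrumentor later swaps the method on the class, the already-captured reference still points at the untraced function. Everything below (`make_sdk`, `instrument`, `from_client`, `demo`) is a hypothetical stand-in, with no real openai, instructor, or OpenTelemetry calls.

```python
import functools


def make_sdk():
    """Return a fresh toy SDK class, so every demo run starts uninstrumented."""

    class Completions:
        def create(self, **kwargs):
            return {"echo": kwargs}

    return Completions


def instrument(cls, spans):
    """Stand-in for an instrumentor: replace cls.create with a traced wrapper."""
    original = cls.create

    @functools.wraps(original)
    def traced(self, **kwargs):
        spans.append("create")  # pretend this emits a span
        return original(self, **kwargs)

    cls.create = traced


def from_client(completions):
    """Stand-in for a patching library: capture the bound method right now."""
    create = completions.create  # bound to whatever function the class has today

    def structured_create(**kwargs):
        return create(**kwargs)

    return structured_create


def demo():
    Completions = make_sdk()
    spans = []

    patched_first = from_client(Completions())       # client built before instrumenting
    instrument(Completions, spans)
    instrumented_first = from_client(Completions())  # client built after instrumenting

    patched_first(model="toy")
    missed = len(spans)                 # spans recorded by the early client
    instrumented_first(model="toy")
    captured = len(spans) - missed      # spans recorded by the late client
    return missed, captured
```

Calling `demo()` returns the span counts for the two orderings: the client built before `instrument` records nothing, the one built after records one span. If the real libraries behave like this, re-creating the instructor client after instrumenting should be enough.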

The issue I encountered yesterday seems specific to instructor + opentelemetry + notebook environment. The above approach works in a .py script, but fails in a .ipynb notebook.

instructor uses event loops that clash with the notebook's event loop, specifically when openai is instrumented (via openllmetry or openinference).
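
I didn't save the traceback, so the following is only an illustration of one common kind of event-loop clash, not a reproduction of the instructor failure. A notebook kernel already has an asyncio loop running, and any synchronous helper that tries to start its own loop with `asyncio.run` fails there, while the same code is fine in a plain script. `fake_llm_call`, `sync_call` and `notebook_cell` are made-up names for the sketch.

```python
import asyncio


def loop_is_running() -> bool:
    """True when called from inside a running asyncio event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


async def fake_llm_call() -> str:
    # Hypothetical stand-in for an async client call
    return "ok"


def sync_call() -> str:
    """Synchronous wrapper that starts its own event loop."""
    coro = fake_llm_call()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def notebook_cell() -> str:
    """Simulates a notebook cell: a loop is already running when this executes."""
    try:
        return sync_call()
    except RuntimeError as exc:
        return f"clash: {exc}"
```

In a script, `sync_call()` works because no loop is running. Under `asyncio.run(notebook_cell())`, which mimics the notebook situation, the inner `asyncio.run` raises `RuntimeError`. Checking `loop_is_running()` first is one way to tell the two environments apart.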

Conclusions

I'll open an issue on the instructor repo. Maybe all we need to do here is add a docs page? Feel free to close the issue.
