open-telemetry / semantic-conventions

Defines standards for generating consistent, accessible telemetry across a variety of domains
Apache License 2.0

Standardize the structure of the content of a prompt / completion #1557

Open karthikscale3 opened 6 days ago

karthikscale3 commented 6 days ago

Area(s)

area:gen-ai

Is your change request related to a problem? Please describe.

We need to standardize the structure of prompts and completions; this should be standardized to the OpenAI structure.

Describe the solution you'd like

Use the OpenAI structure to standardize it.
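For concreteness, here is a minimal sketch of what an OpenAI-structured prompt/completion payload looks like. The `role`/`content` message shape and the `choices` array follow the OpenAI Chat Completions API; wrapping them in standalone event dicts is illustrative only, not a proposed convention:

```python
# Illustrative sketch: OpenAI-style prompt and completion payloads.
# Field names follow the OpenAI Chat Completions API; the event
# wrapping itself is an assumption for illustration.

prompt_event = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is OpenTelemetry?"},
    ]
}

completion_event = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "OpenTelemetry is ..."},
            "finish_reason": "stop",
        }
    ]
}

# A backend that indexes these events can rely on the same keys
# regardless of which provider produced the telemetry.
print(prompt_event["messages"][1]["role"])              # user
print(completion_event["choices"][0]["finish_reason"])  # stop
```

The point of standardizing on this shape is that indexing and search tooling can assume one schema rather than one per provider.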

Describe alternatives you've considered

No response

Additional context

This was discussed during the Nov 7th GenAI SIG meeting - https://docs.google.com/document/d/1EKIeDgBGXQPGehUigIRLwAUpRGa7-1kXB736EaYuJ2M/edit?tab=t.0

codefromthecrypt commented 2 days ago

Please recognize I am playing devil's advocate here: I have less practical experience than most on the team, and I may misunderstand what this issue means since I wasn't on the last call. So take this feedback with a grain of salt if I've misread the goal from the description.

I understand the idea here is to make the GenAI semantics for log events (i.e. request/response formats) the same as OpenAI's, so that you can refer to the OpenAI OpenAPI spec when in doubt about the schema. This could be helpful for those doing indexing or search.

My first thought is that we should solicit thoughts from folks who maintain production systems attempting similar OpenAI normalization for portability, e.g. litellm @ishaan-jaff or vLLM @simon-mo. Even though those projects aren't driven by observability, they would have many practical intersections.

Personally, I can see a lot of value in continued investment in semantics that relate to OpenAI, and I have also noticed OpenAI-based portability mechanisms pop up, one of which is discussed below and could be pulled into another issue. We have at least a couple of choices.

There are some concerns we would want to visit, both technically and from a neutrality viewpoint, if we went for the latter.

Practicalities of openai portability via extra_body

I actually found this issue because I was wondering how we would model extra_body, which is already in use to tunnel parameters to other clouds, or to decorate requests with configuration such as guardrails. I was looking to see whether we were thinking of editing OpenAI's semantics (here) to support it, and what to do about data-overlap concerns (since the extension of the body lives inside the body). If this is totally independent, ask me to kick this part to a different issue.

For example, in our Python test data we use OpenAI's extra_body field, which helps pass additional parameters through for various reasons. I have seen it in use in Azure OpenAI, LangChain, vLLM, and litellm code or issues, and can cite examples as needed. However helpful extra_body is for extending requests, it is undocumented and not in the OpenAI OpenAPI spec.

Interestingly in litellm/langfuse, I ran into an example of using extra_body for trace ID propagation!

karthikscale3 commented 2 days ago

Thanks @codefromthecrypt for sharing your thoughts. We should definitely solicit thoughts from the code owners of other related projects. For context, I created this issue for tracking because we couldn't find volunteers to work on this during our last SIG call. As for standardizing on the OpenAI spec: in my humble opinion, it's a decent medium-term strategy for us, for the following reason.

The main concern right now is the other closed-model providers, i.e. Anthropic, Cohere, etc., which have their own clients for accessing their models. While those clients resemble OpenAI's client in many ways, they are quite different when it comes to certain API parameters, data structures, etc. This is where proxies like litellm come into the picture. The good thing about proxies is that the litellm client, for instance, looks very similar to the OpenAI spec; in fact, the litellm OTEL instrumentation code we created (Langtrace) was mostly a copy-paste of the OpenAI OTEL instrumentation.

My general thought is that the industry is converging towards the OpenAI spec based on the above data points, which is why I think it's a decent medium-term strategy. But what happens if OpenAI changes the spec is a very valid concern. Soliciting thoughts from other OSS code owners would be a good way to converge on a standard that works for all of us.