open-telemetry / semantic-conventions

Defines standards for generating consistent, accessible telemetry across a variety of domains
Apache License 2.0

Standardize the structure of the content of a prompt / completion #1557

Open karthikscale3 opened 6 days ago

karthikscale3 commented 6 days ago

Area(s)

area:gen-ai

Is your change request related to a problem? Please describe.

We need to standardize the structure of prompts and completions; this should be standardized to the OpenAI structure.

Describe the solution you'd like

Use the OpenAI structure to standardize it.
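For concreteness, here is a minimal sketch of what an OpenAI-structured prompt/completion payload looks like. The `role`/`content` message shape and the `choices` array follow the OpenAI Chat Completions API; wrapping them in standalone event dicts is illustrative only, not a proposed convention:

```python
# Illustrative sketch: OpenAI-style prompt and completion payloads.
# Field names follow the OpenAI Chat Completions API; the event
# wrapping itself is an assumption for illustration.

prompt_event = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is OpenTelemetry?"},
    ]
}

completion_event = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "OpenTelemetry is ..."},
            "finish_reason": "stop",
        }
    ]
}

# A backend that indexes these events can rely on the same keys
# regardless of which provider produced the telemetry.
print(prompt_event["messages"][1]["role"])              # user
print(completion_event["choices"][0]["finish_reason"])  # stop
```

The point of standardizing on this shape is that indexing and search tooling can assume one schema rather than one per provider.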

Describe alternatives you've considered

No response

Additional context

This was discussed during the Nov 7th GenAI SIG meeting - https://docs.google.com/document/d/1EKIeDgBGXQPGehUigIRLwAUpRGa7-1kXB736EaYuJ2M/edit?tab=t.0

codefromthecrypt commented 2 days ago

Please recognize I am playing devil's advocate here: I have less practical experience than most on the team, and I may misunderstand what this issue means since I wasn't on the last call. So take this feedback with a grain of salt if I've misread the goal from the description.

I understand the idea here is to make the GenAI semantics for log events (i.e. request/response formats) the same as OpenAI's, so that you can refer to the OpenAI OpenAPI spec when in doubt about the schema. This could be helpful for those doing indexing or search.

My first thought is that we should solicit thoughts from folks who maintain production systems attempting similar OpenAI normalization for portability, e.g. litellm @ishaan-jaff or vLLM @simon-mo. Even though those projects aren't driven by observability, they would have many practical intersections.

Personally, I can see a lot of value in continued investment in semantics that relate to OpenAI, and I have also noticed OpenAI-based portability mechanisms pop up, one of which is discussed below and could be pulled into another issue. We have at least a couple of choices.

There are some concerns we would want to visit, both technically and from a neutrality viewpoint, if we went for the latter.

Practicalities of openai portability via extra_body

I actually found this issue because I was wondering how we would model extra_body, which is already in use to tunnel parameters to other clouds, or to decorate requests with configuration such as guardrails. I was looking to see whether we were thinking of editing OpenAI's semantics (here) to support it, and what to do about data-overlap concerns (since the extension of the body lives inside the body). If this is totally independent, ask me to kick this part to a different issue.

For example, in our Python test data we use OpenAI's extra_body field, which helps pass additional parameters through for various reasons. I have seen it in use in Azure OpenAI, LangChain, vLLM, and litellm code or issues, and can cite examples as needed. However helpful extra_body is for extending requests, it is undocumented and not in the OpenAI OpenAPI spec.

Interestingly in litellm/langfuse, I ran into an example of using extra_body for trace ID propagation!

karthikscale3 commented 2 days ago

Thanks @codefromthecrypt for sharing your thoughts. We should definitely solicit thoughts from the code owners of other related projects. For context, I created this issue for tracking because we couldn't find volunteers to work on this during our last SIG call. As for standardizing on the OpenAI spec: in my humble opinion, it's a decent medium-term strategy for us, for the following reason.

The main concern right now is the other closed-model providers, i.e. Anthropic, Cohere, etc., which have their own clients for accessing their models. While those clients resemble OpenAI's client in many ways, they are quite different when it comes to certain API parameters, data structures, etc. This is where proxies like litellm come into the picture. The good thing about proxies is that the litellm client, for instance, looks very similar to the OpenAI spec; in fact, the litellm OTEL instrumentation code we created (Langtrace) was mostly a copy-paste of the OpenAI OTEL instrumentation.

My general thought is that the industry is converging towards the OpenAI spec based on the above data points, which is why I think it's a decent medium-term strategy. But what happens if OpenAI changes the spec is a very valid concern. Soliciting thoughts from other OSS code owners would be a good way to converge on a standard that works for all of us.