AI Agent framework Semantic Convention

gyliu513 commented 3 weeks ago

Area(s)

area:gen-ai

Is your change request related to a problem? Please describe.

There are now many agent frameworks for GenAI, including crewai, autogen, etc. These frameworks include many different components, such as agents, tools, and tasks. We need a semantic convention for those resources as well as for the agents themselves.

@lmolkova @nirga @drewby @karthikscale3 ^^

Describe the solution you'd like

Extend the GenAI Semantic Convention to cover agents

Describe alternatives you've considered

No response

Additional context

No response

lmolkova commented 3 weeks ago

FYI: I'm baking an early prototype here: https://github.com/microsoft/opentelemetry-semantic-conventions/pull/3. I'll re-send the PR against this repo after a bit more prototyping.

gyliu513 commented 3 weeks ago

I ran some tests with crewai and Instana observability at https://gyliu513.github.io/jekyll/update/2024/10/22/crewai-observability.html; we may need an agent-framework-level semantic convention.

I also ran some tests with langtrace; it can capture some traces and metrics as well when I use crewai based on langchain.

[Image: Screenshot 2024-10-29 at 3 18 22 PM]

@karthikscale3 also has some demos of agent observability with langtrace.

@lmolkova it is great that you have a PR in progress; I will take a look, thanks!

lmolkova commented 3 weeks ago

wow, this looks awesome!

I'm struggling with a couple of things and hope to get your thoughts on a few big ones:

What's the scope: client framework, client to service-side agent, multi-agent story? How many layers do we need?
The level of unification: e.g. task execution is the same as an openai assistant run and will be the same as an azure AI agent run. How far do we need to go in attempts to unify?

Essentially my main worry is that I don't fully understand what exactly we want to unify in semconv or what we want the conventions for.

There is always an alternative that the LLM client level is unified but frameworks do some extra stuff which does not need much consistency. LMK what you think.
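To make that alternative concrete, here is a minimal sketch of what "unified at the LLM client level, free-form above it" could look like on a single span. The `gen_ai.operation.name` and `gen_ai.request.model` keys come from the existing GenAI conventions; the `crewai.*` keys, the model name, and the `split_attributes` helper are hypothetical and only illustrate the split.

```python
# Prefix used by the existing GenAI semantic conventions.
GEN_AI_PREFIX = "gen_ai."


def split_attributes(attributes):
    """Separate unified GenAI attributes from framework-specific extras."""
    unified = {k: v for k, v in attributes.items() if k.startswith(GEN_AI_PREFIX)}
    extras = {k: v for k, v in attributes.items() if not k.startswith(GEN_AI_PREFIX)}
    return unified, extras


span_attributes = {
    # Unified layer: the same keys regardless of which framework made the call.
    "gen_ai.operation.name": "chat",
    "gen_ai.request.model": "example-model",  # illustrative model name
    # Framework layer: hypothetical keys, no cross-framework consistency assumed.
    "crewai.task.description": "Summarize findings",
    "crewai.agent.role": "researcher",
}

unified, extras = split_attributes(span_attributes)
```

With this split, a backend could build generic LLM views from `unified` alone and treat `extras` as opaque metadata.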

karthikscale3 commented 3 weeks ago

Thanks for starting the thread @gyliu513. Shown below is the span graph on Langtrace for CrewAI. Here's a general pattern that's emerging across agentic frameworks:

[1] Sessions - Each session can have multiple agents working independently or together to perform a bunch of tasks.
[2] Agents - Different agents that can do different tasks
[3] Agent Config - Sequential, Hierarchical, Networked
[4] Tools - Tools that agents have access to
[5] Tasks - Tasks defined for agents

Based on our experience at Langtrace, developers like to see:
[1] traces that are isolated and grouped at these high level constructs (agents, tools, tasks etc.)
[2] the relationship between these constructs per session
[3] metadata related to each one of these constructs

With all the above requirements in mind, we designed our instrumentation for crewai and other agentic frameworks we support. I think we can come up with sem conv for these high level constructs that are common for these agentic frameworks. Let me know what you all think.
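As a rough illustration of the constructs listed above, here is a minimal sketch of a session whose agents, tasks, and tools form a span tree that can be grouped by construct. It uses only the Python standard library rather than a real tracing SDK, and every name in it (`Span`, `group_by_kind`, the attribute keys such as `agent.config`) is hypothetical, not a proposed convention or Langtrace's actual instrumentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Span:
    """Toy span: a named node tagged with the construct it represents."""
    name: str
    kind: str  # one of "session", "agent", "task", "tool"
    attributes: Dict[str, str] = field(default_factory=dict)
    parent: Optional["Span"] = field(default=None, repr=False)
    children: List["Span"] = field(default_factory=list)

    def child(self, name: str, kind: str,
              attributes: Optional[Dict[str, str]] = None) -> "Span":
        """Create a child span and record the parent/child relationship."""
        span = Span(name, kind, dict(attributes or {}), parent=self)
        self.children.append(span)
        return span


def group_by_kind(root: Span) -> Dict[str, List[str]]:
    """Walk the tree depth-first and group span names by construct."""
    groups: Dict[str, List[str]] = {}
    stack = [root]
    while stack:
        span = stack.pop()
        groups.setdefault(span.kind, []).append(span.name)
        stack.extend(reversed(span.children))
    return groups


# One session, two agents run sequentially; attribute keys are made up.
session = Span("research-session", "session", {"agent.config": "sequential"})
researcher = session.child("researcher", "agent", {"agent.role": "researcher"})
find_sources = researcher.child("find-sources", "task")
find_sources.child("web_search", "tool", {"tool.name": "web_search"})
writer = session.child("writer", "agent", {"agent.role": "writer"})
writer.child("draft-report", "task")

groups = group_by_kind(session)
```

The parent links capture the per-session relationships, the `kind` tag gives the grouping, and `attributes` carries the per-construct metadata; a shared convention would mostly need to agree on the names used for those three things.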

gyliu513 commented 3 weeks ago

[1] Sessions - Each session can have multiple agents working independently or together to perform a bunch of tasks.
[2] Agents - Different agents that can do different tasks
[3] Agent Config - Sequential, Hierarchical, Networked
[4] Tools - Tools that agents have access to
[5] Tasks - Tasks defined for agents

@karthikscale3 good summary, thanks!

@lmolkova any comments on this? I think it may answer your question above; I hope it is clear.

drewby commented 1 week ago

This paper may be useful input for this discussion: https://arxiv.org/abs/2411.05285

In particular, Fig 7 takes a stab at generalizing the various steps that may appear in a Trace.
