microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

.Net: New Feature: Allow use of an existing OpenAI Assistants thread when utilizing `AgentGroupChat` #8716

Open joeyj opened 1 month ago

joeyj commented 1 month ago

Currently, using AgentGroupChat with OpenAIAssistantAgent causes a new thread to be created and deleted for each chat. Since threads are an externally managed resource, is there a design where they could be managed outside of the chat (e.g., long-lived threads)?
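
To illustrate the current behavior, roughly (a sketch assuming the experimental Agents API as it stands today; exact signatures vary by Semantic Kernel version, and `RunOnceAsync` is just an illustrative helper with the agent created elsewhere via `OpenAIAssistantAgent.CreateAsync`):

```csharp
// Rough sketch of today's pattern (experimental Agents API; signatures may vary by SK version).
using System;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Agents;
using Microsoft.SemanticKernel.Agents.OpenAI;
using Microsoft.SemanticKernel.ChatCompletion;

public static class Example
{
    public static async Task RunOnceAsync(OpenAIAssistantAgent agent, string userInput)
    {
        // The chat owns the conversation state: for an assistant agent it creates an
        // OpenAI thread internally, and the caller never sees the thread-id.
        AgentGroupChat chat = new(agent);
        chat.AddChatMessage(new ChatMessageContent(AuthorRole.User, userInput));

        await foreach (ChatMessageContent response in chat.InvokeAsync())
        {
            Console.WriteLine(response.Content);
        }

        // Resetting the chat deletes the thread it created, so nothing remains
        // on the OpenAI side to resume later.
        await chat.ResetAsync();
    }
}
```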

crickman commented 1 month ago

@joeyj - Thanks for this suggestion. I'd like to explore what this might look like.

I'm struggling a bit to reconcile this requirement with existing behaviors / use cases:

  1. Any instance of an agent is allowed to participate in more than one conversation or chat.
  2. AgentChat / AgentGroupChat doesn't have specific knowledge of any particular agent type.
  3. Different assistant agents may certainly target different thread-ids for the same conversation. For example, nothing stops a user from targeting models in different Azure regions / different endpoints, or even mixing an Azure endpoint with an OpenAI endpoint. The framework does not constrain this.

Introducing a ThreadId property on an assistant agent breaks point 1. Per point 2, AgentChat doesn't know what a thread is, so how would this be expressed? And per point 3, how many thread-ids, and for which agents?

We are working on a serialization feature that will allow the group-chat to be deserialized using the same threads. This would allow a long-lasting conversation. It wouldn't support joining a new AgentChat to existing threads, however.
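
Roughly, the shape might look something like this (purely illustrative; the serializer is still being designed, so treat every name below as a placeholder rather than a committed API):

```csharp
// Hypothetical sketch only: the serialization feature is in progress, so the
// AgentChatSerializer name and method shapes below are placeholders, not a committed API.
using System.IO;
using System.Threading.Tasks;
using Microsoft.SemanticKernel.Agents;

public static class PersistenceSketch
{
    public static async Task SaveAsync(AgentGroupChat chat, Stream destination)
    {
        // Capture the chat state, including references to the assistant threads it is using.
        await AgentChatSerializer.SerializeAsync(chat, destination);
    }

    public static async Task RestoreAsync(AgentGroupChat freshChat, Stream source)
    {
        // Rehydrate a newly constructed chat so it continues on the same threads.
        AgentChatSerializer serializer = await AgentChatSerializer.DeserializeAsync(source);
        await serializer.DeserializeAsync(freshChat);
    }
}
```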

Thoughts?

joeyj commented 1 month ago

@crickman Thanks for the detailed reply.

It might help to give a bit more context on how we are using Semantic Kernel. Currently, we are running it in a stateless distributed service to support scenarios like SMS webhooks to communicate with our customers. We end up loading and persisting ChatHistory on every call. OpenAI Assistants look very attractive for solving this, since they let us store our messages remotely for their entire lifetime without having to load/persist all of them every time.

We could instead use OpenAIAssistantAgent directly and manage the threads ourselves, but we'd miss out on AgentGroupChat for multi-agent scenarios when operating in this stateless fashion. In theory, we could use the same thread for multiple agents directly in OpenAI Assistants, which would remove the need to cache them locally in memory in Semantic Kernel.
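
For example, something along these lines (a sketch assuming the thread-management methods currently exposed on OpenAIAssistantAgent; exact names/signatures may differ by version, and `HandleSmsAsync` is just an illustrative helper):

```csharp
// Sketch of managing the thread ourselves with OpenAIAssistantAgent (names may vary by SK version).
// The only state persisted between webhook calls is the thread-id.
using System;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Agents.OpenAI;
using Microsoft.SemanticKernel.ChatCompletion;

public static class StatelessExample
{
    public static async Task<string> HandleSmsAsync(OpenAIAssistantAgent agent, string? threadId, string sms)
    {
        // Reuse the caller's long-lived thread, or create one on first contact.
        threadId ??= await agent.CreateThreadAsync();

        await agent.AddChatMessageAsync(threadId, new ChatMessageContent(AuthorRole.User, sms));

        await foreach (ChatMessageContent response in agent.InvokeAsync(threadId))
        {
            Console.WriteLine(response.Content); // send the reply SMS here
        }

        // Return the thread-id so it can be stored (e.g., keyed by phone number) for the
        // next call; the messages themselves live in the OpenAI thread.
        return threadId;
    }
}
```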

Hopefully that context helps a bit.

Some thoughts on how this could be supported in Semantic Kernel: