microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

Introduce Workflow Knowledge to Plugins #7333

Open waghmare-omkar opened 1 month ago

waghmare-omkar commented 1 month ago

Today, plugins are collections of functions that perform atomic actions. This design is beneficial as it exposes the Large Language Model (LLM) to a variety of specific actions it can perform. However, when these atomic actions are highly domain-specific, the LLM may not be fully aware of the typical sequence in which these actions should be executed.

Example Scenario

Consider a plugin designed to manage database entries with the following functions:

  1. Add_Entry
  2. Update_Entry
  3. Delete_Entry
  4. Open_Transaction
  5. Close_Transaction

To add an entry to a database, the typical sequence of functions would be:

  1. Open_Transaction
  2. Add_Entry
  3. Close_Transaction

Currently, if you ask the LLM to add an entry to the database, it might directly call the Add_Entry function without executing the necessary Open_Transaction and Close_Transaction steps. This indicates that the LLM lacks awareness of the domain-specific workflow that should be followed when interacting with a database.
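
The failure mode described above can be reproduced without any LLM. The sketch below is a hypothetical in-memory stand-in for the database plugin (the class, method names, and error type are assumptions made for illustration, not part of Semantic Kernel). It shows that a direct Add_Entry call fails, while the expected three-step sequence succeeds.

```python
class TransactionError(RuntimeError):
    """Raised when an entry operation runs outside an open transaction."""


class DatabasePlugin:
    """Hypothetical in-memory stand-in for the plugin described above."""

    def __init__(self) -> None:
        self.entries: dict = {}
        self._in_transaction = False

    def open_transaction(self) -> None:
        self._in_transaction = True

    def close_transaction(self) -> None:
        self._in_transaction = False

    def add_entry(self, key: str, value: str) -> None:
        # Atomic action: it does not open a transaction on its own.
        if not self._in_transaction:
            raise TransactionError("Add_Entry called before Open_Transaction")
        self.entries[key] = value


plugin = DatabasePlugin()

# What the LLM might do today: call Add_Entry directly.
try:
    plugin.add_entry("user:1", "Ada")
    direct_call_failed = False
except TransactionError:
    direct_call_failed = True

# The sequence the plugin author expects.
plugin.open_transaction()
plugin.add_entry("user:1", "Ada")
plugin.close_transaction()
```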

Proposed Solution

From a plugin authoring perspective, there should be a way for plugin authors to specify workflows that involve the functions they have defined. This would enable the LLM to understand and execute the correct sequence of actions for domain-specific tasks.
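
One possible shape for such a declaration is sketched below. The `Workflow` class and `validate_workflows` helper are hypothetical names invented for this example; nothing like them is confirmed to exist in Semantic Kernel. The sketch only shows that a workflow can be expressed as an ordered list of function names and checked against the functions the plugin actually defines.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Workflow:
    """Hypothetical descriptor: a goal plus an ordered list of function names."""
    goal: str
    steps: Tuple[str, ...]


def validate_workflows(function_names: Set[str], workflows: List[Workflow]) -> None:
    """Reject workflows that reference functions the plugin does not define."""
    for workflow in workflows:
        unknown = [step for step in workflow.steps if step not in function_names]
        if unknown:
            raise ValueError(
                f"Workflow '{workflow.goal}' uses unknown functions: {unknown}"
            )


# Function names taken from the example scenario above.
DATABASE_FUNCTIONS = {
    "Add_Entry",
    "Update_Entry",
    "Delete_Entry",
    "Open_Transaction",
    "Close_Transaction",
}

DATABASE_WORKFLOWS = [
    Workflow(
        goal="Add an entry to the database",
        steps=("Open_Transaction", "Add_Entry", "Close_Transaction"),
    ),
]

validate_workflows(DATABASE_FUNCTIONS, DATABASE_WORKFLOWS)
```

Validating at registration time would catch a workflow that drifts out of sync with the plugin's functions before it ever reaches the model.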

Benefits

Notes

While the database example provided could be implemented such that the Add_Entry function internally handles transaction management, the larger point is about the concept of "workflows." This feature would be beneficial across different domains where specific sequences of actions are necessary.
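
For reference, the alternative mentioned here (Add_Entry handling the transaction itself) might look like the sketch below. The `Database` class and its `log` attribute are assumptions made for illustration; the point is only that a context manager can hide the open/close steps from the caller, which works for this example but does not generalize to every multi-step workflow.

```python
from contextlib import contextmanager
from typing import Iterator


class Database:
    """Hypothetical in-memory database that records the calls it receives."""

    def __init__(self) -> None:
        self.entries: dict = {}
        self.log: list = []

    @contextmanager
    def transaction(self) -> Iterator[None]:
        self.log.append("Open_Transaction")
        try:
            yield
        finally:
            # Runs even if the body raises, so the transaction is always closed.
            self.log.append("Close_Transaction")

    def add_entry(self, key: str, value: str) -> None:
        # The transaction is managed internally; callers never see it.
        with self.transaction():
            self.entries[key] = value
            self.log.append("Add_Entry")


db = Database()
db.add_entry("user:1", "Ada")
```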

By introducing workflow knowledge to plugins, we can significantly enhance the capability and usability of LLMs in performing complex, domain-specific tasks.

moonbox3 commented 1 month ago

Hi @waghmare-omkar,

Thank you for your suggestion.

Currently, the models we work with do not support a built-in "workflow" component. When we send a JSON payload to the model, it includes the model settings, a list of role/content messages, and the tool definitions along with a tool choice.

To guide the model in following a specific sequence of actions, like the workflow you mentioned, we typically rely on prompt engineering. This involves crafting a specific system message that outlines the desired workflow. Additionally, one may add a more detailed description to the kernel function, which is included as part of the tools JSON sent to the model.

Your example scenario can be addressed by including the workflow in the system message. For instance, the request payload to the model might look like this:

{
    "model": "gpt-4",
    "max_tokens": 2000,
    "stream": true,
    "temperature": 0.7,
    "top_p": 0.8,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "database-open_transaction",
                "description": "This function opens a database transaction. It should be called before adding or updating entries.",
                "parameters": {
                    "type": "object",
                    "properties": {},
                    "required": []
                }
            }
        }
    ],
    "tool_choice": "auto",
    "messages": [
        {
            "role": "system",
            "content": "To add an entry to a database, use the functions in this order: Open_Transaction, Add_Entry, and then Close_Transaction."
        },
        {
            "role": "user",
            "content": "Add this information to the database..."
        }
    ]
}

This way, the model is guided to follow the correct sequence of actions. We can enhance this approach by incorporating detailed descriptions and specifying the order of operations in the system message. This is only an example and can be tailored to fit your specific application.
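
A payload like the one above can also be assembled programmatically. The following is a minimal sketch using only the standard library; the helper names (`function_tool`, `build_request`), the tool names `database-add_entry` and `database-close_transaction`, and the `entry` parameter are assumptions that extend the naming pattern in the example. It builds the dictionary only and does not send a request.

```python
import json
from typing import Optional


def function_tool(name: str, description: str, properties: Optional[dict] = None) -> dict:
    """Describe one function in the tools format used in the payload above."""
    properties = properties or {}
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(properties),
            },
        },
    }


def build_request(system_message: str, user_message: str, tools: list) -> dict:
    """Combine the workflow instructions, the user request, and the tools."""
    return {
        "model": "gpt-4",
        "tools": tools,
        "tool_choice": "auto",
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
    }


tools = [
    function_tool(
        "database-open_transaction",
        "Opens a database transaction. Call this before adding or updating entries.",
    ),
    function_tool(
        "database-add_entry",  # hypothetical name and parameter
        "Adds an entry. Requires an open transaction.",
        {"entry": {"type": "string", "description": "The entry to add"}},
    ),
    function_tool(
        "database-close_transaction",  # hypothetical name
        "Closes the current transaction. Call this after all changes are made.",
    ),
]

payload = build_request(
    "To add an entry to a database, use the functions in this order: "
    "Open_Transaction, Add_Entry, and then Close_Transaction.",
    "Add this information to the database...",
    tools,
)

print(json.dumps(payload, indent=2))
```

Repeating the ordering hint in each function description, as done here, gives the model the same guidance in two places: the system message and the tool definitions.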

matthewbolanos commented 1 month ago

We just discussed this internally, and we realize that there is likely a need for a plugin author to generate a system prompt that can then be injected into an agent. We'll be doing some design activities on our side to figure out the best way of solving this.

We believe that this might be better solved with multi-agent patterns, so we're going to tag it as such and follow back up once we've validated our hypothesis.
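
As an illustration of the idea of a plugin author generating a system prompt, the sketch below renders workflow declarations into prompt text. It is not the design the team settled on; the function name `render_workflow_prompt` and the dictionary format are hypothetical.

```python
from typing import Dict, List


def render_workflow_prompt(plugin_name: str, workflows: Dict[str, List[str]]) -> str:
    """Turn {goal: [ordered function names]} into system prompt text."""
    lines = [f"When using the {plugin_name} plugin, follow these workflows:"]
    for goal, steps in workflows.items():
        lines.append(f"- {goal}: call " + ", then ".join(steps) + ".")
    return "\n".join(lines)


prompt = render_workflow_prompt(
    "database",
    {"To add an entry": ["Open_Transaction", "Add_Entry", "Close_Transaction"]},
)
print(prompt)
```

The resulting string could be appended to an agent's system message, so the workflow knowledge travels with the plugin rather than being rewritten by each application.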

waghmare-omkar commented 1 month ago

Thanks. That sounds good. If you would like, I can also walk through our use case to provide another perspective. I am also interested in the multi-agent approach. Thanks for not considering this.

Regards, Omkar

waghmare-omkar commented 1 month ago

Oh boy, autocorrect messed up the reply.

I meant, Thanks for considering this feature request for design discussion!

Haha. :)

Regards, Omkar
