jngz-es commented 1 year ago

RFC - Conversation Plugin in OpenSearch

Overview

The conversation plugin is to support conversation-based applications. It is designed to provide simple conversation APIs which wrap up the interactions with ml-commons (llm support) for application developer to use, that means applications only need to use these APIs to have a conversation after a model registration with ml-commons plugin.

We will have another RFC for ml-commons ReAct support.

Components

Conversation plugin

Conversation Management - manage chat sessions. Store history messages per session and all sessions’ meta data.
Index - we use OpenSearch index as data store to provide persistent memory functionality.

ML-commons plugin

Models Management - The ml-commons plugin manages all model lifecycle, provides like predict/get APIs etc.
Tools Management - The ml-commons plugin manages all built-in tools, such as a set of OpenSearch tool, like search tool, AD tool, alerting tool, ISM tool etc. We can also let users build their tools like queries to their private knowledge bases, external metrics, etc.
Prompt Template Management - Provide some built-in prompt templates for users to generate prompts. We can also let users register their own templates to use.
Embeddings - generate vector for messages through embeddings models hosted in ml-commons.
VectorDB - provide some simple vector db functions by leveraging knn index.
Text splitter - split long document to segments fitting to the input of llm. Consider making the splitting more meaningful, like section by section to have each piece meaningful.
ReAct Agent - ml-commons also takes ReAct steps to run tools and interact with llm.

Architecture

conversation-rfc-0(1)

Workflow

Chat

chat-0(1)

The chatbot frontend register their own models through ml-commons API. It contains llm meta, prompt template and parameters, tools.
After getting user message, the frontend run _chat API of conversation plugin with session id and model id. The model id is required, the conversation plugin doesn’t store any default model id for the session. If no session id in the request, the conversation plugin will create a new session for this first user message that will be store as “title” for this new session.
The conversation plugin prepares the session history and combines it into the predict request to ml-commons.
The ml-commons plugin executes a ReAct process which interact with llm, then return the ReAct result.
The conversation plugin stores the pair of use/llm messages into the session, then return llm message.
The frontend gets the llm message and return it to the user.

Session Meta data

{
     "user_id": "user-0",
    "model_id": "model-0",
    "title": "the first user message",
    "created_time": timestamp
}

Message data

{
     "user_id": "user-0",
    "session_id": "session-0",
    "question": "",
    "answer": "",
    "created_time": timestamp
}

APIs

Chat

POST /_plugins/_conversation/_chat
{
     "session_id": "session-0",
    "model_id": "model-0",
    "parameters": {
        "context": "local knowledge, examples, specific plugin description etc..."
        "question": "Hello OpenSearch",
        "tools": ["MathTool", "SearchIndexTool", "SearchPipelineTool"],
        "verbose": true
    }
}

response
{
    "session_id": "session-0",
    "answer": "llm output"
}

Get session history

GET /_plugins/_conversation/history?sessionId={string}&pageSize={integer}&currentPage={integer}

//By default sorted by created_time ascendingly
response
{
     "session_id": "2dcae340-534a-457d-a9eb-8b8ee963ed9c",
    "steps": [
        {
            "question": "",
            "answer": "",
            "created_time": 1686632250529
        },
        {
            "question": "",
            "answer": "",
            "created_time": 1686632275116
        },
        ...
    ]
}

Get sessions

GET /_plugins/_conversation/sessions?pageSize={integer}&currentPage={integer}

//By default sorted by created_time ascendingly
response
{
    "sessions": [
        {
            "session_id": "2dcae340-534a-457d-a9eb-8b8ee963ed9c",
            "title": "",
            "created_time": 1686632250529
        },
        {
            "session_id": "502919cf-d083-4d63-8a33-6047755e5a0a",
            "title": "",
            "created_time": 1686632275116
        },
        ...
    ]
}

Ml-commons APIs

Register model

POST /_plugins/_ml/models/_register?deploy=true
{
    "name": "openAI model: chat ReAct",
    "function_name": "remote",
    "version": "1.0.0",
    "description": "test model",
    "connector": {
        "http/v1": {
            "credential": {
                "api_key": "{{api_key}}"
            },
            "parameters": {
                "model": "gpt-3.5-turbo",
                "temperature": 0.1,
                "response_filter": "$.choices[0].message.content"
            },
            "headers": {
                "Authorization": "Bearer ${credential.api_key}"
            },
            "endpoint": "https://api.openai.com/v1/chat/completions",
            "http_method": "post",
            "body_template": "{ \"model\": \"${parameters.model}\",  \"messages\": [ { \"role\": \"user\",\"content\": \"${parameters.prompt}\" } ],   \"temperature\": ${parameters.temperature}, \"stop\": ${parameters.stop}}"
        }
    },
    "tools": ["MathTool", "SearchIndexTool", "SearchPipelineTool"],
}

# response
{
    "task_id": "G7uZ_YcBwpRELbgnHdG6",
    "model_id": "HLuZ_YcBwpRELbgnHdHa",
    "status": "CREATED"
}

Predict

POST /_plugins/_ml/models/model-id/_predict
{
    "parameters": {
        "examples": ["example 1", "example 2"],
        "question": "question",
        "verbose": true
    }
}

Get tools list

GET /_plugins/_ml/tools

response
{
    "tools": ["MathTool", "SearchIndexTool", "SearchPipelineTool"]
}

Get tool

GET /_plugins/_ml/tools/MathTool

response
{
    "name": "MathTool",
    "description": "It is a tool to calculate math ......"
}

Appendix

No-Code GenAI Application Creation Support

nocode-1(1)

The Application Builder UI It is designed as a drag-and-drop application workflow builder. The visual component operations are converted to REST API calls to the different backend entities through the Application Builder Backend plugin to create the corresponding function object.

The Application Builder Backend It is responsible for creating backend components and the whole workflow by interacting with different backend entities. After the workflow setup successfully, it returns an entry point of the application to the UI for users to interact with the application.

macohen commented 1 year ago

@jngz-es how do you see search pipelines involved here? I think there's room and it's in the diagram, but it's not clear what the integration looks like. Suggestion: think about each plugin, especially ml-commons and see if there are elements that can be extracted into a separate component that can be re-usable for other purposes. Just based on the description and not knowing any more , I could see Text Splitter being a good possibility. What do you think?

jonfritz commented 1 year ago

Posting on this doc as well. I noticed this RFC is quite similar to one that’s already posted on this topic (#1150). I love the excitement around these ideas! However, I'm concerned that the overlap in RFCs will cause confusion in the community and make it difficult to align our development.

We would love to find a process where we can work together. The process that I’m used to in open source communities is to start with one RFC and then iterate and add feedback rather than creating multiple RFCs. This process has some benefits - it drives alignment in the open, enables the community to share and iterate on ideas, and makes the end product easy to understand and use.

My suggestion is that we adopt this approach to work together on the RFC for conversational features in OpenSearch. We greatly appreciate the feedback you've already given to the original RFC (#1150), and we'd be happy to do the work to update this RFC and continue to iterate to incorporate any other technical suggestions you have. Let us know what you think! We are excited to find ways to work together to make OpenSearch the best platform for building conversational applications.

jngz-es commented 1 year ago

@macohen Hi Mark, search pipeline can be a tool similar with search tool managed by ml-commons.

jonfritz commented 1 year ago

@jngz-es from the call today, I think we have a path forward on how we can avoid overlap and duplicate efforts using a typical open source process. 1/RFC #1150 will describe the implementation for the OpenSearch community for conversational search (both single interaction and multi-interaction "chat"), conversational memory, and the APIs associated with these functions. The spirit of this RFC is to enable natural language (using LLMs) interactions on data stored in OpenSearch. If you have suggestions on a different architecture or approach (e.g. ideas you've proposed in this separate RFC), please add that feedback to #1150 and as a community, we can debate and decide on the best approach. Once that's been decided, we'd love the community to help build this feature set to the implementation decided on in that RFC. 2/RFC #1151 should target a different theme than existing RFCs to avoid customer and community confusion, and from the discussion so far, it seems the big difference is the intention to enable building conversational agents that do not interact with data in OpenSearch (conversely, #1150 is focused on chat interfaces for data in OpenSearch). Can we change the title of #1151 to reflect this (e.g. "Multi-agent conversational platform in OpenSearch" or something that makes it clear what is being proposed that is net new to what is already being discussed in #1150 ) and explore what these use cases are, and why OpenSearch should be a platform for customers to build these applications? This would dive deeper into the net new functionality proposed and avoid overlap with the community efforts to build conversational search in #1150. It will also help the community give feedback on the approach and where this code should live. For me, the main difference between the two RFCs are: #1150 deals with conversational search with data stored in OpenSearch (e.g. RAG) and #1151 is for conversational agents that do not interact with data in OpenSearch (more similar to LangChain's general feature set). Super critical that we follow crisp open source development practices, and I appreciate working together to make sure the community can crisply focus and develop new features together.

ylwu-amzn commented 1 year ago

This RFC suggests a new general conversation plugin which will be on top of ml-commons Agent Framework (#1161). #1150 mentioned the similar thing of conversation API. And #1161 is the Agent Framework RFC. Will resolve this RFC for now. Let's continue conversation API/feature topic on #1150 and Agent Framework topic on #1161

opensearch-project / ml-commons