opensearch-project / flow-framework

OpenSearch plugin that enables builders to innovate AI apps on OpenSearch
Apache License 2.0

[FEATURE] Pre-defined templates and defaults #496

Closed amitgalitz closed 7 months ago

amitgalitz commented 9 months ago

Is your feature request related to a problem?

Flow-Framework currently supports creating and provisioning templates in order to execute a sequence of API calls in OpenSearch. Pasted at the bottom of this issue is an example of a more complicated template in which we create a connector, set up a model, and add a few tools and agents.

As the example shows, such a template can be quite long to send to an API, and users might not want or need to keep altering every variable in it.

It is common in CDK repos and other configuration automation platforms to have a simpler mechanism for changing just the variables we want. A related improvement that could be added at the same time is a set of predefined templates that customers can use as-is and also edit in a fine-grained way.

What solution would you like?

Users should be able to make an API request to create a template by choosing a predefined template and editing only the variables they care about.

http://_plugins/_flow_framework/workflow?use_case=openAI-deploy-model-1
{
  "create_connector_1.credential.openAIkey": "asdfg-1234",
  "create_connector_1.model": "chatgpt-4.0",
  "register_remote_model_1.name": "chatgpt-1"
}

How we will achieve this?

In order to be able to easily substitute any subset of fields in the template that we want, we will have to maintain two different sets of documents. Giving these temporary names for easier discussion:

  1. Substitution Ready Template: a set of templates that have all their fields ready to be substituted
  2. Predefined defaults: a set of files with predefined defaults for each of the templates (potentially some default files can be used for multiple templates)

Example:

Substitution ready default template:

{
  "name": "${template.name}",
  "description": "${template.description}",
  "use_case": "MODEL_DEPLOYMENT_API_KEY",
  "version": {
    "template": "1.0.0",
    "compatibility": [
      "2.12.0",
      "3.0.0"
    ]
  },
  "workflows": {
    "provision": {
      "nodes": [
        {
          "id": "create_connector_1",
          "type": "create_connector",
          "user_inputs": {
            "name": "${create_connector_1}",
            "description": "${create_connector_1.description}",
            "version": "1",
            "protocol": "${create_connector_1.protocol}",
            "parameters": {
              "endpoint": "${create_connector_1.endpoint}",
              "model": "${create_connector_1.model}"
            },
            "credential": {
              "key": "${create_connector_1.credential.key}"
            },
            "actions": [
              {
                "action_type": "predict",
                "method": "POST",
                "url": "https://${parameters.endpoint}/v1/chat/completions"
              }
            ]
          }
        },
        {
          "id": "register_remote_model_1",
          "type": "register_remote_model",
          "previous_node_inputs": {
            "create_connector_1": "parameters"
          },
          "user_inputs": {
            "name": "${register_remote_model_1.name}",
            "function_name": "remote",
            "description": "${register_remote_model_1.description}"
          }
        },
        {
          "id": "deploy_model_1",
          "type": "deploy_model",
          "previous_node_inputs": {
            "register_remote_model_1": "model_id"
          }
        }
      ],
      "edges": [
        {
          "source": "create_connector_1",
          "dest": "register_remote_model_1"
        },
        {
          "source": "register_remote_model_1",
          "dest": "deploy_model_1"
        }
      ]
    }
  }
}

Default config example

{
  "id": "openAI-deploy-model",
  "matching-template-use-case": "MODEL_DEPLOYMENT_API_KEY",
  "template.name": "deploy-openai-model",
  "create_connector_1.name": "OpenAI-connector",
  "create_connector_1.description": "Connector to public AI model service for GPT 3.5",
  "create_connector_1.protocol": "http",
  "create_connector_1.model": "gpt-3.5-turbo",
  "create_connector_1.endpoint": "api.openai.com",
  "create_connector_1.credential.key": "api.openai.com",
  "register_remote_model_1.name": "test-name",
  "register_remote_model_1.description": "test-descrip"
}

Request from customers:

http://_plugins/_flow_framework/workflow?use_case=openAI-deploy-model-1
{
  "create_connector_1.credential.key": "asdfg-1234",
  "create_connector_1.model": "gpt-4.0-turbo",
  "register_remote_model_1.name": "chatgpt-4.0"
}
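
To make the interaction of the three documents concrete, here is a minimal sketch of how a request body could be layered over the predefined defaults and then substituted into a substitution-ready template. It is an illustration under assumptions, not the plugin's implementation: the names `render`, `build_template`, `TEMPLATE`, and `DEFAULTS` are hypothetical, and the template is trimmed to a few fields.

```python
import re

# Hypothetical, trimmed substitution-ready template using the ${...} style above.
TEMPLATE = {
    "name": "${template.name}",
    "user_inputs": {
        "model": "${create_connector_1.model}",
        "credential": {"key": "${create_connector_1.credential.key}"},
        "url": "https://${parameters.endpoint}/v1/chat/completions",
    },
}

# Hypothetical predefined defaults for that template.
DEFAULTS = {
    "template.name": "deploy-openai-model",
    "create_connector_1.model": "gpt-3.5-turbo",
    "create_connector_1.credential.key": "",
}

PLACEHOLDER = re.compile(r"\$\{([^{}]+)\}")


def render(node, params):
    """Recursively replace ${key} in strings; unknown keys are left untouched."""
    if isinstance(node, dict):
        return {k: render(v, params) for k, v in node.items()}
    if isinstance(node, list):
        return [render(v, params) for v in node]
    if isinstance(node, str):
        return PLACEHOLDER.sub(
            lambda m: str(params.get(m.group(1), m.group(0))), node
        )
    return node


def build_template(template, defaults, overrides):
    """Request values win over defaults; the merged map drives substitution."""
    params = {**defaults, **overrides}
    return render(template, params)


request_body = {
    "create_connector_1.credential.key": "asdfg-1234",
    "create_connector_1.model": "gpt-4.0-turbo",
}
result = build_template(TEMPLATE, DEFAULTS, request_body)
```

In this sketch `template.name` falls back to its default, the two request keys override theirs, and the connector's own `${parameters.endpoint}` placeholder passes through unchanged because no supplied key matches it.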

Changes to create workflow API:

In order to enable this change we will need to add an optional parameter to the create workflow API to distinguish which predefined defaults to use.

Option 1:

http://_plugins/_flow_framework/workflow?use_case=openAI-deploy-model-1
{
  "create_connector_1.credential.key": "asdfg-1234",
  "create_connector_1.model": "gpt-4.0-turbo",
  "register_remote_model_1.name": "chatgpt-4.0"
}

Option 2:

http://_plugins/_flow_framework/workflow?use_case=openAI-deploy-model-1&template=MODEL_DEPLOYMENT_API_KEY
{
  "create_connector_1.credential.key": "asdfg-1234",
  "create_connector_1.model": "gpt-4.0-turbo",
  "register_remote_model_1.name": "chatgpt-4.0"
}
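
The difference between the two options can be sketched as a lookup: with Option 1 the template is inferred from the defaults document's `matching-template-use-case` field, while with Option 2 an explicit `template` parameter takes precedence. The function `resolve_selection`, the in-memory `DEFAULTS_BY_ID` map, and the `localhost:9200` host are assumptions made for illustration only.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical stand-in for the stored defaults documents, keyed by their id.
DEFAULTS_BY_ID = {
    "openAI-deploy-model-1": {
        "matching-template-use-case": "MODEL_DEPLOYMENT_API_KEY",
    },
}


def resolve_selection(url, defaults_by_id):
    """Return (use_case, template) for a create-workflow URL, or None if no use_case."""
    query = parse_qs(urlsplit(url).query)
    use_case = query.get("use_case", [None])[0]
    if use_case is None:
        return None  # ordinary create-workflow request with a full template body
    explicit = query.get("template", [None])[0]
    # Option 2: explicit template wins. Option 1: infer it from the defaults doc.
    template = explicit or defaults_by_id[use_case]["matching-template-use-case"]
    return use_case, template


base = "http://localhost:9200/_plugins/_flow_framework/workflow"
option_1 = resolve_selection(base + "?use_case=openAI-deploy-model-1", DEFAULTS_BY_ID)
option_2 = resolve_selection(
    base + "?use_case=openAI-deploy-model-1&template=MODEL_DEPLOYMENT_API_KEY",
    DEFAULTS_BY_ID,
)
```

Both calls resolve to the same pair here; Option 2 only matters when one defaults file should be applied to a template other than the one it names.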

Storage:

Both the predefined substitution-ready templates and the predefined defaults can be stored in a system index. We can also give users the ability to create their own predefined default files that they can easily change. If we store these documents in a system index, we can control how users add additional documents.

Potential new system indices are .plugins-flow-framework-defaults for the default values and .plugins-flow-framework-configurable-templates for the configurable templates.

Another option is to keep the configurable templates in the template index, which would be available to all users, but I don't know whether that would create issues with provisioning. We won't necessarily be provisioning the configurable templates themselves; instead, they will be used as the source that is copied into other saved and provisioned templates.

Appendix

Default templates to add:

  1. Create and deploy model
     a. Cohere embedding
     b. OpenAI embedding
     c. Bedrock Titan Embedding
     d. Bedrock Titan Multimodal Embedding
     e. Cohere LLM
     f. OpenAI LLM
  2. Neural Sparse Search with pretrained model
  3. Semantic Search with no model creation
  4. Semantic Search with query enricher and no model
  5. Semantic search with Cohere embedding model
  6. Semantic search with Cohere embedding model and query enricher
  7. Multimodal Search with no model creation
  8. Multimodal Search with Bedrock Titan model creation
  9. Hybrid Search
  10. Conversational search with Cohere model

dbwiddis commented 9 months ago

See https://github.com/opensearch-project/flow-framework/issues/261 for a previous thought I had along these lines, and the wrong way to do it ...

joshpalis commented 8 months ago

I think this is a great idea to improve ease of use. I'm just curious: what's the rationale for separating default configs and substitution-ready templates? Would it not be simpler to have substitution-ready templates (with some default values) in the template index and have the user provision them while supplying the substitution params? This way we wouldn't need to store the default configs in a separate system index.

amitgalitz commented 8 months ago

> I think this is a great idea to improve ease of use. I'm just curious: what's the rationale for separating default configs and substitution-ready templates? Would it not be simpler to have substitution-ready templates (with some default values) in the template index and have the user provision them while supplying the substitution params? This way we wouldn't need to store the default configs in a separate system index.

I separated the defaults from the substitution-ready templates because I feel they serve different purposes. It is similar to our new feature of providing parameters in the provision API path in this PR: https://github.com/opensearch-project/flow-framework/pull/525

The defaults are the equivalent of the API path values, and the substitution-ready template is the equivalent of the template provided in the create request.

I wanted to make everything available for substitution without requiring the user to supply all of those values each time. We want to potentially have defaults for all the connectors here: https://github.com/opensearch-project/ml-commons/tree/main/docs/remote_inference_blueprints. How do you envision the configurable template looking if it's merged with the defaults?

dbwiddis commented 8 months ago

> In order to be able to easily substitute any subset of fields in the template that we want, we will have to maintain two different sets of documents. Giving these temporary names for easier discussion:
>
>   1. Substitution Ready Template: a set of templates that have all their fields ready to be substituted
>   2. Predefined defaults: a set of files with predefined defaults for each of the templates (potentially some default files can be used for multiple templates)

I don't think we need two sets of documents: that introduces a lot of complexity, and separates the substitution name (in a template) from its default somewhere else. This:

We're basically substituting params at runtime (we are literally calling the map params).

We already have a section of the template (for any workflow) of user_params. Why not either:

  1. Create a new similar section default_params for the substitutions in that workflow. It's right there, next to the workflow, for easy cross reference.
  2. Use the existing user_params field as the defaults, with clear documentation that they can be overridden on the REST path/body.
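
As a rough illustration of the first suggestion, the defaults could live in the workflow itself and be overridden by whatever the request supplies. The `default_params` key is the name proposed above, not an existing template field, and `effective_params` is a hypothetical helper.

```python
from collections import ChainMap

# Hypothetical template with a per-workflow default_params section (option 1 above).
template = {
    "name": "deploy-openai-model",
    "workflows": {
        "provision": {
            "default_params": {
                "create_connector_1.model": "gpt-3.5-turbo",
                "create_connector_1.endpoint": "api.openai.com",
            },
            "nodes": [],
        }
    },
}


def effective_params(template, workflow, request_params):
    """Request params (REST path/body) shadow the defaults stored in the template."""
    defaults = template["workflows"][workflow].get("default_params", {})
    # ChainMap searches its maps left to right, so request values win.
    return dict(ChainMap(request_params, defaults))


params = effective_params(
    template, "provision", {"create_connector_1.model": "gpt-4.0-turbo"}
)
```

Because the defaults sit next to the nodes that consume them, a single document is enough; the second suggestion would look the same with `user_params` in place of `default_params`.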

dbwiddis commented 8 months ago

> "name": "${create_connector_1}",

FYI, I've already implemented these substitutions with double curly braces, e.g., "name": "${{create_connector_1}}".
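
A small sketch of what matching the double-curly-brace form might look like. The regular expression and the `substitute` helper are assumptions for illustration and are not taken from the plugin's code.

```python
import re

# Assumed pattern: ${{key}} where key is made of word characters and dots.
DOUBLE_BRACE = re.compile(r"\$\{\{([\w.]+)\}\}")


def substitute(text, params):
    """Replace ${{key}} with params[key]; leave anything unmatched as-is."""
    return DOUBLE_BRACE.sub(
        lambda m: str(params.get(m.group(1), m.group(0))), text
    )


line = substitute(
    '"name": "${{create_connector_1}}"',
    {"create_connector_1": "OpenAI-connector"},
)
url = substitute(
    "https://${parameters.endpoint}/v1/chat/completions",
    {"parameters.endpoint": "api.openai.com"},
)
```

With this pattern, single-brace placeholders such as the `${parameters.endpoint}` in the connector URL are not touched, so the two placeholder styles cannot be confused during template substitution.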

So the work to implement this is really: