opensearch-project / terraform-provider-opensearch

https://registry.terraform.io/providers/opensearch-project/opensearch
Apache License 2.0

[FEATURE] OpenSearch Plugin Connectors and other commands #178

Open uriofferup opened 6 months ago

uriofferup commented 6 months ago

Is your feature request related to a problem?

I'm trying to automate deployments of OpenSearch with its associated applications. In the past I tried using awscurl to create indices but that has its limitations. Using the OpenSearch Terraform Provider helped, but I still found some other required initial configurations that are hard to automate.

What solution would you like?

I'm looking to create something like an opensearch_execute_command resource that would help simplify many configurations as an initial step, before specialized resources that correctly manage the lifecycle are created. It could be set in a resource block like this:

resource "opensearch_execute_command" "connector"
{
    method: "POST"
    endpoint: "/_plugins/_ml/connectors/_create"
    body: <<EOF
    {
        "name": "OpenAI Chat Connector",
        "description": "The connector to public OpenAI model service for GPT 3.5",
        "version": 1,
        "protocol": "http",
        "parameters": {
            "endpoint": "api.openai.com",
            "model": "gpt-3.5-turbo"
        },
        "credential": {
            "openAI_key": "..."
        },
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": "https://${parameters.endpoint}/v1/chat/completions",
                "headers": {
                    "Authorization": "Bearer ${credential.openAI_key}"
                },
                "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
            }
        ]
    }
    EOF
}

Then the resource could expose a result with the response body.
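For illustration, downstream configuration could then parse that attribute. The attribute name and response shape here are hypothetical, since the resource does not exist yet:

output "connector_id" {
  # Assumes the proposed resource exposes the raw response body as "result".
  value = jsondecode(opensearch_execute_command.connector.result)["connector_id"]
}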

What alternatives have you considered?

Using a null_resource with awscurl, but that approach has its challenges with authentication and with parsing the results.
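For reference, the workaround looks roughly like the sketch below. This is illustrative only: the domain endpoint, region, and payload file are placeholders, and awscurl must be installed on the machine running Terraform.

resource "null_resource" "create_connector" {
  provisioner "local-exec" {
    # Signs the request with the local AWS credentials and posts the
    # connector definition to the ML Commons create endpoint.
    command = <<-EOT
      awscurl --service es --region us-east-1 -X POST \
        -d @connector.json \
        https://my-domain.us-east-1.es.amazonaws.com/_plugins/_ml/connectors/_create
    EOT
  }
}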

Do you have any additional context?

For example, following the documentation for implementing external ML models, it's required to send something like:

POST /_plugins/_ml/connectors/_create
{
    "name": "OpenAI Chat Connector",
    "description": "The connector to public OpenAI model service for GPT 3.5",
    "version": 1,
    "protocol": "http",
    "parameters": {
        "endpoint": "api.openai.com",
        "model": "gpt-3.5-turbo"
    },
    "credential": {
        "openAI_key": "..."
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://${parameters.endpoint}/v1/chat/completions",
            "headers": {
                "Authorization": "Bearer ${credential.openAI_key}"
            },
            "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
        }
    ]
}
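On success, the create call returns the generated connector ID (per the ML Commons documentation), which is what the proposed result attribute would need to surface:

{
    "connector_id": "<generated id>"
}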
rblcoder commented 5 months ago

@uriofferup Are you willing to try https://github.com/opensearch-project/opensearch-go/blob/main/guides/json.md?

prudhvigodithi commented 5 months ago

Hey @uriofferup, thanks for your recommendation. I had an offline sync-up with @rblcoder. The Terraform provider should be handling the CRUD operations; letting users execute HTTP calls directly by declaring a method option would defeat the purpose of the provider doing the CRUD operations. It would also introduce inconsistencies between what the user passes via the method option and what Terraform is willing to perform based on the state store; for example, the user might pass POST, but based on the state store Terraform might have the client execute a PUT operation. I would recommend creating the right resource for each specific use case and allowing the Terraform provider to do the heavy lifting. In this case a new resource has to be created for handling ML connectors (for example, https://registry.terraform.io/providers/opensearch-project/opensearch/latest/docs/resources/data_stream handles the data stream setup).
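For illustration, a dedicated resource in the spirit of the data_stream one might look something like the following; the resource type and its arguments are hypothetical, pending design:

resource "opensearch_ml_connector" "openai_chat" {
  # Full connector definition as expected by the ML Commons create API.
  body = file("${path.module}/connector.json")
}

Downstream resources could then reference opensearch_ml_connector.openai_chat.id instead of parsing a raw response body.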

Also, just curious: has this execute_command pattern been added to any other well-known Terraform providers? Happy to explore and integrate it in a standard, best-practices manner.

Thanks @bbarani

pberger514 commented 3 months ago

Hi there, I just realized my issue https://github.com/opensearch-project/terraform-provider-opensearch/issues/203 was a duplicate of this one, so I closed it.

One thing I will add is that registering the connector as a model also needs to be supported, because downstream tasks that are supported in Terraform will need to point to the registered model.
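For context, the register step is a separate ML Commons call that binds a model to the connector. The request below follows the documented shape, with the connector_id taken from the create response; the name and description are placeholders:

POST /_plugins/_ml/models/_register
{
    "name": "openAI-gpt-3.5-turbo",
    "function_name": "remote",
    "description": "OpenAI chat model",
    "connector_id": "<connector_id from the create response>"
}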

I'm looking to set up a full neural search pipeline in Terraform, and I believe these are the last necessary components, so I would appreciate it if validation could include that as a requirement.

Also, curious whether this feature is on the roadmap?