opensearch-project / documentation-website

The documentation for OpenSearch, OpenSearch Dashboards, and their associated plugins.
https://opensearch.org/docs

[DOC] Update the Conversational Search page to show new parameters being added in 2.11 #5125

Closed austintlee closed 9 months ago

austintlee commented 9 months ago

What do you want to do?

Tell us about your request. Provide a summary of the request and all versions that are affected.

Change 1: Setting up the pipeline

PUT /_search/pipeline/<pipeline_name>
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "openai_pipeline_demo",
        "description": "Demo pipeline Using OpenAI Connector",
        "model_id": "<model_id>",
        "context_field_list": ["text"]
      }
    }
  ]
}

to

PUT /_search/pipeline/<pipeline_name>
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "openai_pipeline_demo",
        "description": "Demo pipeline Using OpenAI Connector",
        "model_id": "<model_id>",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistance",
        "user_instructions": ""Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}
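
As a quick sanity check (not part of the proposed doc change), the stored pipeline can be retrieved with the standard search pipeline API to confirm the new fields were saved:

GET /_search/pipeline/<pipeline_name>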

New parameters

system_prompt: this is a message sent to the LLM (e.g., OpenAI) with the "system" role.

user_instructions: this is an additional message sent to the LLM with the "user" role. It is not uncommon for user instructions to be sent as part of the system prompt instead; exposing both fields makes it easier to experiment with prompts.
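
For illustration only (this is an assumption about how the processor assembles the chat request, not taken from the docs), these two fields roughly correspond to the leading messages of an OpenAI-style chat payload, with the retrieved contexts and the question appended after them:

{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant" },
    { "role": "user", "content": "Generate a concise and informative answer in less than 100 words for the given question" },
    { "role": "user", "content": "<search results and llm_question assembled by the processor>" }
  ]
}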

Change 2: Using the pipeline

GET /<index_name>/_search?search_pipeline=<pipeline_name>
{
    "query" : {...},
    "ext": {
        "generative_qa_parameters": {
            "llm_model": "gpt-3.5-turbo",
            "llm_question": "Was Abraham Lincoln a good politician",
            "conversation_id": "_ikaSooBHvd8_FqDUOjZ"
        }
    }
}

to

GET /<index_name>/_search?search_pipeline=<pipeline_name>
{
    "query" : {...},
    "ext": {
        "generative_qa_parameters": {
            "llm_model": "gpt-3.5-turbo",
            "llm_question": "Was Abraham Lincoln a good politician",
            "conversation_id": "_ikaSooBHvd8_FqDUOjZ",
                         "context_size": 5,
                         "interaction_size": 5,
                         "timeout": 15
        }
    }
}

New parameters

context_size: this is the number of search results sent to the LLM. This is typically needed to stay within the token size limit, which can vary from model to model (e.g., 4k tokens). Alternatively, you can use the size search request parameter to control the amount of information sent to the LLM (see the example below).

interaction_size: this is the number of interactions (questions and answers) sent to LLMs. As with the number of search results, this can affect the total number of tokens seen by the LLM. If this is not set, the default interaction size of 10 is used.

timeout: this is the number of seconds the RAG pipeline waits for the remote model (via the connector) to respond. The default timeout is currently 30 seconds.
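
For example, a request that relies on size instead of context_size to cap what is sent to the LLM might look like this (a sketch using the standard size search parameter; placeholders as above):

GET /<index_name>/_search?search_pipeline=<pipeline_name>
{
    "query" : {...},
    "size": 5,
    "ext": {
        "generative_qa_parameters": {
            "llm_model": "gpt-3.5-turbo",
            "llm_question": "Was Abraham Lincoln a good politician"
        }
    }
}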

What other resources are available? Provide links to related issues, POCs, steps for testing, etc.

austintlee commented 9 months ago

https://github.com/opensearch-project/ml-commons/issues/1372

austintlee commented 9 months ago

All of the parameters mentioned above are optional.

austintlee commented 9 months ago

cc @ylwu-amzn

austintlee commented 9 months ago

We would like to show a disclaimer that this has only been tested with OpenAI and Bedrock Anthropic Claude models. Can we still add this for the 2.11 release?