Open dtaivpp opened 8 months ago
I would like to take this issue
Here is an example of what the language generation pipeline request looks like:
```json
GET /docbot/_search
{
  "_source": {
    "exclude": [
      "content_embedding"
    ]
  },
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "content": {
              "query": "How do I enable segment replication"
            }
          }
        },
        {
          "neural": {
            "content_embedding": {
              "query_text": "How do I enable segment replication",
              "model_id": "Z8VpCYwBKF5Jo_eo10QE",
              "k": 5
            }
          }
        }
      ]
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "How do I enable segment replication",
      "conversation_id": "JcVbCYwBKF5Jo_eoe0TD",
      "context_size": 3,
      "interaction_size": 3,
      "timeout": 45
    }
  }
}
```
We will need to pass in the model ID, conversation ID, and the question. Then, when we process the response, the generated answer is at `["ext"]["retrieval_augmented_generation"]["answer"]`.
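Pulling the answer out of the response might look like the sketch below. The helper name and the stubbed `fake_response` dict are illustrative; a real response also carries `hits`, timing info, etc.

```python
def extract_rag_answer(search_response: dict) -> str:
    """Pull the generated answer out of a conversational-search response.

    The RAG processor places its output under
    response["ext"]["retrieval_augmented_generation"]["answer"].
    """
    return search_response["ext"]["retrieval_augmented_generation"]["answer"]


# Stubbed response for illustration only:
fake_response = {
    "ext": {
        "retrieval_augmented_generation": {
            "answer": "Set replication.type to SEGMENT when creating the index."
        }
    }
}
print(extract_rag_answer(fake_response))
```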
**Is your feature request related to a problem?**
At the moment we have several pieces of the RAG pipeline built; now we need to pull it all together.
**What solution would you like?**
Our DocBot class will call `docbot.language_model:generate_response` (see #80). `generate_response` will need to query our `cohere-index` and collect the response to return to the user: https://opensearch.org/docs/latest/ml-commons-plugin/conversational-search/#using-the-pipeline

Note: here `interaction_size` is the number of previous chats to send as context, while `context_size` is the number of results from our search that we will send through.

The most challenging part of this PR is that we will need to use a neural search in the query section in order to find the most relevant documents: https://opensearch.org/docs/latest/search-plugins/neural-text-search/#step-4-search-the-index-using-neural-search. The `model_id` that we need to reference here is the `MODEL_ID` that is being used by our ingestion pipeline.
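A minimal sketch of how `generate_response` could assemble that request, assuming the opensearch-py client. The function signatures, the `build_rag_query` helper, and the hard-coded `cohere-index` name are assumptions for illustration; the embedding model ID would be the same `MODEL_ID` our ingestion pipeline uses.

```python
def build_rag_query(question: str, embedding_model_id: str,
                    conversation_id: str) -> dict:
    """Build the hybrid (match + neural) query with generative QA parameters."""
    return {
        "_source": {"exclude": ["content_embedding"]},
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical leg of the hybrid query
                    {"match": {"content": {"query": question}}},
                    # Neural leg: embeds the question with the same model
                    # used at ingestion time
                    {
                        "neural": {
                            "content_embedding": {
                                "query_text": question,
                                "model_id": embedding_model_id,
                                "k": 5,
                            }
                        }
                    },
                ]
            }
        },
        "ext": {
            "generative_qa_parameters": {
                "llm_model": "gpt-3.5-turbo",
                "llm_question": question,
                "conversation_id": conversation_id,
                "context_size": 3,      # search results sent to the LLM
                "interaction_size": 3,  # previous chats sent as context
                "timeout": 45,
            }
        },
    }


def generate_response(client, question: str, embedding_model_id: str,
                      conversation_id: str) -> str:
    """Run the conversational search and return the generated answer."""
    body = build_rag_query(question, embedding_model_id, conversation_id)
    result = client.search(index="cohere-index", body=body)
    return result["ext"]["retrieval_augmented_generation"]["answer"]
```

Keeping the query builder separate from the search call makes it easy to unit-test the request body without a running cluster.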