I tested this on 2.10; it works:
POST /_plugins/_ml/model_groups/_register
{
  "name": "my_remote_model_group_cohere",
  "description": "This is a test group"
}
Response
{
  "model_group_id": "wySNm4oBRiMywALe-EvK",
  "status": "CREATED"
}
POST /_plugins/_ml/connectors/_create
{
  "name": "Cohere embedding",
  "description": "my test connector",
  "version": "1.0",
  "protocol": "http",
  "credential": {
    "cohere_key": "<your_cohere_key>"
  },
  "parameters": {
    "model": "embed-english-v2.0",
    "truncate": "END"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api.cohere.ai/v1/embed",
      "headers": {
        "Authorization": "Bearer ${credential.cohere_key}"
      },
      "request_body": "{ \"texts\": ${parameters.prompt}, \"truncate\": \"${parameters.truncate}\", \"model\": \"${parameters.model}\" }",
      "pre_process_function": "connector.pre_process.cohere.embedding",
      "post_process_function": "connector.post_process.cohere.embedding"
    }
  ]
}
The key part, which is not in the blueprint, is:

  "pre_process_function": "connector.pre_process.cohere.embedding",
  "post_process_function": "connector.post_process.cohere.embedding"

These two process functions are mandatory if you want the Cohere model to work with neural-search. Because remote model inputs and outputs vary a lot, these two functions transform the input to fit the remote model and transform the output to fit neural-search. Find more details in this doc.
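To illustrate what they do (a sketch based on the formats shown in this thread, not the exact internal payloads): neural-search hands the model a list of documents, the pre-process function reshapes that list into the texts parameter the Cohere API expects, and the post-process function reshapes Cohere's embeddings array into the sentence_embedding output shown in the predict example below.

What neural-search sends to the model (sketch):
{ "text_docs": ["Say this is a test"] }

What the Cohere API receives after pre-processing (sketch):
{ "texts": ["Say this is a test"], "truncate": "END", "model": "embed-english-v2.0" }

What Cohere returns, which post-processing converts into the sentence_embedding format (sketch):
{ "embeddings": [[-0.77246094, -0.12927246, ...]] }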
Response
{
  "connector_id": "2SSRm4oBRiMywALecEuh"
}
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "Cohere embedding model",
  "function_name": "remote",
  "model_group_id": "wySNm4oBRiMywALe-EvK",
  "description": "test model",
  "connector_id": "2SSRm4oBRiMywALecEuh"
}
Response
{
  "task_id": "4CSSm4oBRiMywALeRktD",
  "status": "CREATED"
}
Then use the Get Task API to find the model ID:
GET /_plugins/_ml/tasks/4CSSm4oBRiMywALeRktD
Response
{
  "model_id": "4SSSm4oBRiMywALeRktd",
  "task_type": "REGISTER_MODEL",
  "function_name": "REMOTE",
  "state": "COMPLETED",
  "worker_node": [
    "lkN3LiY3SfmR6DRUO7SR3Q"
  ],
  "create_time": 1694827169347,
  "last_update_time": 1694827169384,
  "is_async": false
}
POST /_plugins/_ml/models/4SSSm4oBRiMywALeRktd/_predict
{
  "parameters": {
    "texts": ["Say this is a test"]
  }
}
Response
{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [
            4096
          ],
          "data": [
            -0.77246094,
            -0.12927246,
            -0.52490234,
            ...
          ]
        }
      ]
    }
  ]
}
@ylwu-amzn okay, this is working in 2.9 with the predict endpoint. There still seems to be an issue with ingestion, as I get the following when using _bulk with the ingestion pipeline. Does the text_embedding step of the ingestion pipeline get created as part of ML Commons, or does that live elsewhere?
{
  "create": {
    "_index": "cohere-index",
    "_id": "1",
    "status": 400,
    "error": {
      "type": "illegal_argument_exception",
      "reason": "Invalid JSON in payload"
    }
  }
}
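For reference, the failing call was a _bulk request of roughly this shape (the field name is illustrative; the index and document ID come from the error above):

POST /_bulk
{ "create": { "_index": "cohere-index", "_id": "1" } }
{ "passage_text": "Say this is a test" }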
The ingestion pipeline is in the neural-search plugin. Can you share your k-NN index and ingestion pipeline settings? I need to reproduce this issue to debug it.
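For reference, a minimal setup of the kind being asked about looks like this (the pipeline and field names are illustrative; the dimension 4096 matches the embedding size shown above):

PUT /_ingest/pipeline/cohere-pipeline
{
  "description": "Generate embeddings with the Cohere remote model",
  "processors": [
    {
      "text_embedding": {
        "model_id": "4SSSm4oBRiMywALeRktd",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}

PUT /cohere-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "cohere-pipeline"
  },
  "mappings": {
    "properties": {
      "passage_text": {
        "type": "text"
      },
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": 4096
      }
    }
  }
}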
Thanks for the debugging, @ylwu-amzn! For future people who find this thread: the issues have been resolved in the blueprint. There was a parameter that should have been $parameters.texts but was incorrectly put into the Java code as $parameters.prompt. I've updated the blueprint to use $parameters.prompt until the Java code can be fixed.
The second issue was the pre/post process functions missing from the templates. These have been added in PR #1351.
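Concretely, that means the blueprint's request_body passes the neural-search input under ${parameters.prompt} for now, exactly as in the connector above:

"request_body": "{ \"texts\": ${parameters.prompt}, \"truncate\": \"${parameters.truncate}\", \"model\": \"${parameters.model}\" }"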
What is the bug? When ingesting data using Cohere's blueprint and neural search, the ingestion pipeline returns the "Invalid JSON in payload" error shown above.
How can one reproduce the bug? Steps to reproduce the behavior: see the DevTools reproduction above.
What is the expected behavior? Data is expected to be ingested.
What is your host/environment?