ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related features within OpenSearch.
[BUG] Registering Pretrained Opensearch Model Fails Due to Null Model Config #2981
What is the bug?
On OpenSearch 2.11.0, attempting to register OpenSearch pretrained models such as amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill fails with the error `model config is null`. The request used is:
POST /_plugins/_ml/models/_register?deploy=true
{
"name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill",
"function_name": "SPARSE_TOKENIZE",
"version": "1.0.0",
"model_format": "TORCH_SCRIPT",
"model_content_hash_value": "86bab435d031edb2a6d921fd9ac317a7541d5d95666f642b606e7d0ebfb84358",
"description": "This is a neural sparse encoding model: It transfers text into sparse vector, and then extract nonzero index and value to entry and weights. It serves only in ingestion and customer should use tokenizer model in query."
}
The message `model config is null` appears in only two places in the codebase, and I suspect it's coming from this class. However, I don't think the above request matches the conditions for this error, which is what leads me to believe this might be a bug.
Can you confirm? If so, could you provide some context on how it should work? I'm happy to submit a fix.
How can one reproduce the bug?
Steps to reproduce the behavior:
Provision a new OpenSearch 2.11 server. I verified this bug on both the Docker image and a managed OpenSearch cluster provided by AWS.
Configure the ML plugin.
Make the following request to register a pretrained OpenSearch model (which, per the docs, should not require a URL):
POST /_plugins/_ml/models/_register?deploy=true
{
"name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill",
"function_name": "SPARSE_TOKENIZE",
"version": "1.0.0",
"model_format": "TORCH_SCRIPT",
"model_content_hash_value": "86bab435d031edb2a6d921fd9ac317a7541d5d95666f642b606e7d0ebfb84358",
"description": "This is a neural sparse encoding model: It transfers text into sparse vector, and then extract nonzero index and value to entry and weights. It serves only in ingestion and customer should use tokenizer model in query."
}
Get the task ID from the response in Step 3, and check its status using `GET /_plugins/_ml/tasks/:task_id`
Observe the error, which states `model config is null`
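For anyone who wants to script the last two steps, here is a minimal sketch of a polling helper. It is not part of ml-commons: `wait_for_task`, `TERMINAL_STATES`, and the `get_task` callable are hypothetical names, and it assumes the task response is a JSON object with a `state` field and, on failure, an `error` field. The caller supplies `get_task`, which would issue `GET /_plugins/_ml/tasks/:task_id` against the cluster and return the parsed body.

```python
import time
from typing import Any, Callable, Dict

# Assumption: task states at which the task will not change any further.
TERMINAL_STATES = {"COMPLETED", "FAILED", "COMPLETED_WITH_ERROR"}


def wait_for_task(
    get_task: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a task until it reaches a terminal state and return its last response.

    get_task is a caller-supplied function that fetches the task by ID
    (for example via GET /_plugins/_ml/tasks/<task_id>) and returns the
    parsed JSON body as a dict.
    """
    for _ in range(max_polls):
        task = get_task(task_id)
        if task.get("state") in TERMINAL_STATES:
            return task
        # Not finished yet: wait before asking again.
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

With this bug present, the returned dict would be expected to carry a failed state and the `model config is null` message in its `error` field. Injecting `get_task` and `sleep` keeps the helper independent of any particular HTTP client.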
What is the expected behavior?
The requested model should register and deploy successfully.
What is your host/environment?
OS: AWS managed OpenSearch cluster 2.11
Version: 2.11
Plugins: N/A
Do you have any screenshots?
N/A
Do you have any additional context?
I have largely been following this tutorial provided by the OpenSearch docs.