opensearch-project / ml-commons

ml-commons provides a set of common machine learning algorithms, e.g., k-means or linear regression, to help developers build ML-related features within OpenSearch.
Apache License 2.0

[BUG] Error deploying model, unknown built-in op #3199

Open taborzbislaw opened 1 month ago

taborzbislaw commented 1 month ago

I have gone through the example: opensearch-py-ml/examples/demo_deploy_cliptextmodel.html

The model is correctly registered in the OpenSearch cluster, but the final command of the example, `ml_client.deploy_model(model_id)`, ends with the error `Exception: Model deployment failed`.

After trying to deploy the model directly via the cluster console, i.e. `POST /_plugins/_ml/models//_deploy`,

and checking the corresponding task output with `GET /_plugins/_ml/tasks/`,

I see that deployment fails with the following error returned by the task:

```json
{
  "model_id": "uDNmgZIBC9ZdJM8aMbns",
  "task_type": "DEPLOY_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "FAILED",
  "worker_node": ["xbE0btjVQUGBAWrVqCmUoQ"],
  "create_time": 1728748205169,
  "last_update_time": 1728748210632,
  "error": """{"xbE0btjVQUGBAWrVqCmUoQ":"
Unknown builtin op: aten::scaled_dot_product_attention.
Here are some suggestions:
	aten::_scaled_dot_product_attention

The original call is:
  File \"code/torch/transformers/models/clip/modeling_clip.py\", line 190
    key_states = torch.transpose(torch.view(_54, [_49, -1, 8, 64]), 1, 2)
    value_states = torch.transpose(torch.view(_55, [_48, -1, 8, 64]), 1, 2)
    attn_output = torch.scaled_dot_product_attention(query_states, key_states, value_states, attn_mask, 0., False, scale=0.125)
                  ~~~~~~~~~~ <--- HERE
    attn_output0 = torch.transpose(attn_output, 1, 2)
    input = torch.reshape(attn_output0, [_47, _51, _52])
"}""",
  "is_async": true
}
```
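For context, and this is a guess rather than a confirmed diagnosis: `aten::scaled_dot_product_attention` was introduced in PyTorch 2.0, so a TorchScript archive traced with PyTorch >= 2.0 cannot be loaded by an engine built against an older libtorch. A quick check of which side of that boundary the tracing environment sits on:

```python
# Diagnostic sketch: torch.nn.functional.scaled_dot_product_attention exists
# only in PyTorch >= 2.0. If it is present here, models traced in this
# environment may emit an op that an older libtorch runtime cannot resolve.
import torch

has_sdpa = hasattr(torch.nn.functional, "scaled_dot_product_attention")
print(f"torch {torch.__version__}, scaled_dot_product_attention available: {has_sdpa}")
```

If this prints `True` while the cluster's ML engine bundles a pre-2.0 libtorch, the traced model would trip exactly this "Unknown builtin op" error at deploy time.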

taborzbislaw commented 1 month ago

The same error is raised when trying to deploy other transformer models, e.g.:

```python
from transformers import BertModel, BertTokenizer

model_name = "bert-base-uncased"
text_to_encode = "example search query"
model = BertModel.from_pretrained(model_name, torchscript=True, return_dict=False)
processor = BertTokenizer.from_pretrained(model_name)
```

dblock commented 2 weeks ago

Looks like a server-side problem, moving to ml-commons.