Error when deploying a packaged MLflow model from Hugging Face (huggingface.co/meta-llama/Meta-Llama-Guard-2-8B, huggingface.co/Qwen/Qwen1.5-4B) to a Managed Online Endpoint. The error appears to originate in the MLflow scoring script, specifically in azureml/evaluate/mlflow/hftransformers/_task_based_predictors.py.
Steps to reproduce
1. Download the HF model meta-llama/Meta-Llama-Guard-2-8B.
2. Package the MLflow model using the Python SDK v2 (ModelPackage).
3. Deploy the packaged model environment to a Managed Online Endpoint.
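For context, the repro steps roughly take this shape with the v2 Python SDK (`azure-ai-ml`). This is a minimal sketch, assuming a registered MLflow model named `llama-guard-2-8b` and illustrative endpoint, deployment, and workspace names; the exact `ModelPackage`/`models.package` parameter names may differ across SDK versions, so treat them as assumptions:

```python
def package_and_deploy():
    """Sketch of the repro steps. Requires azure-ai-ml and azure-identity
    plus real workspace credentials; all names below are illustrative."""
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import (
        AzureMLOnlineInferencingServer,
        ManagedOnlineDeployment,
        ManagedOnlineEndpoint,
        ModelPackage,
    )
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id="<subscription-id>",
        resource_group_name="<resource-group>",
        workspace_name="<workspace>",
    )

    # Step 2: package the registered MLflow model into an environment.
    # (Parameter names here are assumptions; check your azure-ai-ml version.)
    package_request = ModelPackage(
        target_environment="llama-guard-pkg-env",
        inferencing_server=AzureMLOnlineInferencingServer(),
    )
    packaged_env = ml_client.models.package(
        "llama-guard-2-8b", "1", package_request
    )

    # Step 3: deploy the packaged environment to a managed online endpoint.
    endpoint = ManagedOnlineEndpoint(name="llama-guard-endpoint")
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()

    deployment = ManagedOnlineDeployment(
        name="blue",
        endpoint_name=endpoint.name,
        environment=packaged_env,
        instance_type="Standard_NC24ads_A100_v4",  # illustrative GPU SKU
        instance_count=1,
    )
    ml_client.online_deployments.begin_create_or_update(deployment).result()
```

The failure below occurs only after deployment, when the endpoint receives its first scoring request.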
Expected behavior
Deployment to the Managed Online Endpoint succeeds, and the endpoint responds to a JSON request with generated text.
Actual behavior
Error: 2024-07-18 18:08:37,534 E [89] azmlinfsrv - Encountered Exception: Traceback (most recent call last):
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 132, in invoke_run
run_output = self._wrapped_user_run(run_parameters, request_headers=dict(request.headers))
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 156, in <lambda>
self._wrapped_user_run = lambda request_headers, **kwargs: self._user_run(**kwargs)
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/inference_schema/schema_decorators.py", line 68, in decorator_input
return user_run(*args, **kwargs)
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/inference_schema/schema_decorators.py", line 68, in decorator_input
return user_run(*args, **kwargs)
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/inference_schema/schema_decorators.py", line 96, in decorator_input
return user_run(*args, **kwargs)
File "/var/mlflow_resources/mlflow_score_script.py", line 448, in run
result = model.predict(input, params=params)
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/mlflow/pyfunc/__init__.py", line 719, in predict
return self._predict_fn(data)
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/azureml/evaluate/mlflow/hftransformers/__init__.py", line 1002, in predict
from azureml.evaluate.mlflow.hftransformers._task_based_predictors import get_predictor
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/azureml/evaluate/mlflow/hftransformers/_task_based_predictors.py", line 18, in <module>
from transformers import pipeline, Conversation
ImportError: cannot import name 'Conversation' from 'transformers' (/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/transformers/__init__.py)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/azureml_inference_server_http/server/routes.py", line 222, in handle_score
timed_result = main_blueprint.user_script.invoke_run(request, timeout_ms=config.scoring_timeout)
File "/opt/miniconda/envs/azureml_env94d7f4560f6648c58f9156957403aeee/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 139, in invoke_run
raise UserScriptException(ex) from ex
azureml_inference_server_http.server.user_script.UserScriptException: Caught an unhandled exception from the user script
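The root cause appears to be a dependency incompatibility rather than a model problem: `_task_based_predictors.py` still does `from transformers import pipeline, Conversation`, but the top-level `Conversation` class was removed from transformers' public API in recent releases (v4.42, per my reading of the release notes; please verify). Until the azureml-evaluate-mlflow scoring code is updated, a plausible workaround is pinning `transformers<4.42` in the packaged model's conda environment. A small stdlib check like the following (a hypothetical helper, not part of any SDK) can flag an incompatible version before packaging:

```python
def transformers_has_conversation(version: str) -> bool:
    """Return True if this transformers version still exports the top-level
    `Conversation` class that _task_based_predictors.py imports.

    Assumption: `Conversation` was removed from the public transformers API
    in v4.42.0 (verify against the transformers release notes)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (4, 42)


# If this returns False for the version resolved in the packaged environment,
# pin e.g. "transformers<4.42" in the model's conda.yaml before packaging.
```
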
Operating System
Linux
Version Information
Additional information
No response