Description

The credentials from the Databricks profiles are not passed along to all subfunctions of the `copy` function in `mlflow_export_import.copy.copy_model_version`. The error occurs only outside of Databricks, e.g. in a standard Python environment.

Probably related to #174, #160.
To Reproduce

1. Set Databricks profiles in `~/.databrickscfg` for `dev-env` and `prod-env`.
2. Execute the code in a standard Python environment (latest mlflow_export_import version).
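For reference, the two profiles in `~/.databrickscfg` would look roughly like this (host and token values are placeholders):

```ini
[dev-env]
host  = https://xxx.azuredatabricks.net/
token = dapi...

[prod-env]
host  = https://xxx.azuredatabricks.net/
token = dapi...
```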
Expected behavior
The authentication happens via the provided profiles and the model is copied.
Error message
No response
```
01-Mar-24 14:18:34 - INFO - Using default logging config without output log file
01-Mar-24 14:18:35 - INFO - Copying model version 'dev_machinelearning.alf.alf_lightgbm/2' to 'prod_bronzeinternal.grid.alf_lightgbm_prod'
MlflowClient SRC:
  client.tracking_uri:  databricks://dev-env
  client._registry_uri: databricks-uc://dev-env
  Credentials
    host: https://xxx.azuredatabricks.net/
    username: None
    password: None
    token: xxx
    aws_sigv4: False
    auth: None
    ignore_tls_verification: None
    client_cert_path: None
    server_cert_path: None
  mlflow fluent:
    mlflow.tracking_uri: file:///workspaces/dataplatform/mlruns
    mlflow.registry_uri: file:///workspaces/dataplatform/mlruns
MlflowClient DST:
  client.tracking_uri:  databricks://prod-env
  client._registry_uri: databricks-uc://prod-env
  Credentials
    host: https://xxx.azuredatabricks.net/
    username: None
    password: None
    token: xxx
    aws_sigv4: False
    auth: None
    ignore_tls_verification: None
    client_cert_path: None
    server_cert_path: None
  mlflow fluent:
    mlflow.tracking_uri: file:///workspaces/dataplatform/mlruns
    mlflow.registry_uri: file:///workspaces/dataplatform/mlruns
Traceback (most recent call last):
  File "/workspaces/dataplatform/.github/workflows/scripts/export_model.py", line 27, in <module>
    src_model_version, dst_model_version = copy(
  File "/home/vscode/.local/lib/python3.10/site-packages/mlflow_export_import/copy/copy_model_version.py", line 70, in copy
    _create_registered_model(src_client, src_model_name, dst_client, dst_model_name, copy_permissions)
  File "/home/vscode/.local/lib/python3.10/site-packages/mlflow_export_import/copy/copy_model_version.py", line 87, in _create_registered_model
    if not utils.calling_databricks() or not copy_permissions:
  File "/home/vscode/.local/lib/python3.10/site-packages/mlflow_export_import/common/utils.py", line 24, in calling_databricks
    dbx_client = dbx_client or DatabricksHttpClient()
  File "/home/vscode/.local/lib/python3.10/site-packages/mlflow_export_import/client/http_client.py", line 204, in __init__
    super().__init__("api/2.0", host, token)
  File "/home/vscode/.local/lib/python3.10/site-packages/mlflow_export_import/client/http_client.py", line 89, in __init__
    (host, token) = mlflow_auth_utils.get_mlflow_host_token()
  File "/home/vscode/.local/lib/python3.10/site-packages/mlflow_export_import/client/mlflow_auth_utils.py", line 24, in get_mlflow_host_token
    _raise_exception(uri)
  File "/home/vscode/.local/lib/python3.10/site-packages/mlflow_export_import/client/mlflow_auth_utils.py", line 42, in _raise_exception
    raise MlflowExportImportException(
mlflow_export_import.common.MlflowExportImportException: {"message": "MLflow tracking URI (MLFLOW_TRACKING_URI environment variable) must be an HTTP URI: 'file:///workspaces/dataplatform/mlruns'.", "http_status_code": 401}
```
Proposed fix

In `common.mlflow_utils`, change `set_experiment` to pass the `dbx_client` along:

```python
"""
Set experiment name.
For Databricks, create the workspace directory if it doesn't exist.
:return: Experiment
"""
if utils.calling_databricks(dbx_client=dbx_client):
    create_workspace_dir(dbx_client, os.path.dirname(exp_name))
try:
    ...
```

and also change `_create_registered_model` in `copy.copy_model_version` to remove the `calling_databricks()` call:

```python
model_exists = copy_utils.create_registered_model(dst_client, dst_model_name)
if not copy_permissions:
    return
...
```

Alternatively, the authenticated client needs to be passed in here for authentication.
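The failure mode and the proposed fix can be illustrated with a minimal, self-contained sketch. Note that the class and function below are simplified stand-ins for `DatabricksHttpClient` and `utils.calling_databricks`, not the real mlflow_export_import code:

```python
# Simplified stand-ins illustrating the proposed fix: reuse an existing,
# profile-authenticated client instead of rebuilding one from the ambient
# MLFLOW_TRACKING_URI (which is a local file:// URI outside Databricks).

class DatabricksHttpClient:
    """Stand-in for mlflow_export_import's DatabricksHttpClient."""
    def __init__(self, host=None, token=None):
        if host is None:
            # Mirrors the reported failure: no HTTP tracking URI available.
            raise RuntimeError("MLflow tracking URI must be an HTTP URI")
        self.host = host
        self.token = token

def calling_databricks(dbx_client=None):
    """Stand-in with the proposed dbx_client passthrough."""
    if dbx_client is not None:
        return True  # an authenticated client was passed down the call chain
    try:
        DatabricksHttpClient()  # current behavior: rebuild from env, fails locally
        return True
    except RuntimeError:
        return False

# Outside Databricks the env-based path fails ...
print(calling_databricks())  # False
# ... but passing the already-authenticated client through succeeds:
client = DatabricksHttpClient(host="https://xxx.azuredatabricks.net", token="xxx")
print(calling_databricks(dbx_client=client))  # True
```

The same passthrough pattern applies to every helper in the `copy` call chain that currently constructs its own `DatabricksHttpClient` from environment variables.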