mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0

[BUG] Loading Pyfunc artifact from pre-saved model uses Windows path format instead of Linux. #11862

Open threaddy opened 6 months ago

threaddy commented 6 months ago

Issues Policy acknowledgement

Where did you encounter this bug?

Local machine

Willingness to contribute

Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.

MLflow version

System information

Describe the problem

I am trying to containerize a PyTorch ML model that is trained and saved locally, using a custom PythonModel wrapper to save some artifacts (scaler, hyperparameters, intermediate predictions).

However, when re-loading the model in the container, the artifact paths are clearly reloaded using a mix of Windows and Linux directory separators. This raises a FileNotFoundError exception.

What is a clean way to fix this behavior? Should I try to save the model differently during training? Or should a path fix be implemented for moving across platforms?
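One possible load-time workaround is to normalize the artifact path before opening it. The following is a minimal sketch, not MLflow's own fix: `to_posix_path` is a hypothetical helper, and the artifact key `"hparams"` in the comment is only an illustrative name. It assumes the artifact file names themselves never contain a backslash, which is a legal character in Linux file names.

```python
from pathlib import PureWindowsPath


def to_posix_path(path: str) -> str:
    """Return `path` with every Windows separator converted to "/".

    PureWindowsPath accepts both "\\" and "/" as separators, so a mixed
    path is parsed correctly and as_posix() re-emits it with "/" only.
    """
    return PureWindowsPath(path).as_posix()


# Hypothetical use inside load_context (the key name is illustrative):
#     hparams = joblib.load(to_posix_path(context.artifacts["hparams"]))
fixed = to_posix_path("/tmp/tmp8ny0hf25/artifacts\\hparams.pkl")
print(fixed)  # /tmp/tmp8ny0hf25/artifacts/hparams.pkl
```

This only repairs the string at load time. It does not change what was recorded when the model was saved on Windows.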

Tracking information

System information: Linux #1 SMP Thu Jan 11 04:09:03 UTC 2024
Python version: 3.11.9
MLflow version: 2.12.1
MLflow module location: /usr/local/lib/python3.11/site-packages/mlflow/__init__.py
Tracking URI: http://localhost:5002
Registry URI: http://localhost:5002
MLflow environment variables:
  MLFLOW_HOST: 0.0.0.0
  MLFLOW_PORT: 5002
  MLFLOW_TRACKING_URI: http://localhost:5002
MLflow dependencies:
  Flask: 3.0.3
  Jinja2: 3.1.3
  alembic: 1.13.1
  click: 8.1.7
  cloudpickle: 3.0.0
  docker: 7.0.0
  entrypoints: 0.4
  fastapi: 0.110.2
  gitpython: 3.1.43
  graphene: 3.3
  gunicorn: 21.2.0
  importlib-metadata: 7.1.0
  markdown: 3.6
  matplotlib: 3.8.4
  numpy: 1.26.3
  packaging: 23.2
  pandas: 2.2.0
  protobuf: 5.26.1
  pyarrow: 15.0.2
  pydantic: 2.6.4
  pytz: 2024.1
  pyyaml: 6.0.1
  querystring-parser: 1.2.4
  requests: 2.31.0
  scikit-learn: 1.4.0
  scipy: 1.12.0
  sqlalchemy: 2.0.27
  sqlparse: 0.5.0
  uvicorn: 0.29.0

Code to reproduce issue

import mlflow

# Read the configured tracking URI and create a client for the registry.
uri = mlflow.tracking.get_tracking_uri()
client = mlflow.tracking.MlflowClient()

# Take the first model version returned by the registry search.
registered_models = client.search_model_versions("")
latest_model_name = registered_models[0].name
latest_model_version = registered_models[0].version

# Loading the pyfunc model is the call that raises FileNotFoundError.
model_uri = f"models:/{latest_model_name}/{latest_model_version}"
model_info = mlflow.models.get_model_info(model_uri)
loaded_model = mlflow.pyfunc.load_model(model_uri)

Stack trace

Traceback (most recent call last):
  File "/app/app_main.py", line 30, in <module>
    loaded_model = mlflow.pyfunc.load_model(model_uri)
  File "/usr/local/lib/python3.11/site-packages/mlflow/pyfunc/model.py", line 468, in _load_pyfunc
    context, python_model, signature = _load_context_model_and_signature(model_path, model_config)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/mlflow/pyfunc/model.py", line 461, in _load_context_model_and_signature
    python_model.load_context(context=context)
  File "C:\Users\fgarb\Documents\0_PARA\1_projects\prj2_anom_container_deploy\train.py", line 172, in load_context
  File "/usr/local/lib/python3.11/site-packages/joblib/numpy_pickle.py", line 650, in load
    with open(filename, 'rb') as f:
         ^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp8ny0hf25/artifacts\\hparams.pkl'
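
For illustration, the path in the error message can be reproduced with the standard library alone. This is a minimal sketch of one plausible cause. It assumes, without confirmation in this thread, that the artifact's relative path was built with Windows separators when the model was saved and then joined to a Linux temp directory at load time. `ntpath` and `posixpath` are the Windows and POSIX flavors of `os.path`, so the snippet behaves the same on any OS.

```python
import ntpath      # os.path as it behaves on Windows
import posixpath   # os.path as it behaves on Linux

# Assumption: at save time (Windows) the relative artifact path is joined with "\".
saved_relpath = ntpath.join("artifacts", "hparams.pkl")

# At load time (Linux) the stored string is joined to the download directory.
# posixpath treats "\" as an ordinary character, so it is kept verbatim.
load_path = posixpath.join("/tmp/tmp8ny0hf25", saved_relpath)

print(repr(saved_relpath))  # 'artifacts\\hparams.pkl'
print(repr(load_path))      # '/tmp/tmp8ny0hf25/artifacts\\hparams.pkl'
```

The second string matches the path reported in the FileNotFoundError, which is consistent with a separator being stored at save time rather than introduced by the container.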

WeichenXu123 commented 6 months ago

Thanks for reporting. You can use Python's os.sep, os.path.join, or pathlib to make the path separators work across platforms.
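
As a rough sketch of the pathlib suggestion: store relative paths with "/" only, and join them to the local root with `pathlib` at load time. `portable_relpath` and `resolve` are hypothetical helper names used for illustration, not MLflow functions, and the temp directory stands in for wherever the model is downloaded.

```python
import tempfile
from pathlib import Path, PurePosixPath


def portable_relpath(*parts: str) -> str:
    """Join parts into a relative path string that always uses "/"."""
    return PurePosixPath(*parts).as_posix()


def resolve(root: str, relpath: str) -> Path:
    """Join a "/"-separated relative path to a local root directory."""
    return Path(root).joinpath(*PurePosixPath(relpath).parts)


# What would be recorded at save time, regardless of the OS doing the saving.
rel = portable_relpath("artifacts", "hparams.pkl")

with tempfile.TemporaryDirectory() as root:
    target = resolve(root, rel)
    target.parent.mkdir(parents=True)
    target.write_bytes(b"demo")          # stand-in for the pickled artifact
    found = resolve(root, rel).exists()  # the same string resolves again on load

print(rel)    # artifacts/hparams.pkl
print(found)  # True
```

Because the stored string never contains an OS-specific separator, the same value can be resolved on Windows and Linux alike.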

github-actions[bot] commented 6 months ago

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.