mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0
18.33k stars 4.14k forks source link

[BUG] mlflow::mlflow_download_artifacts_from_uri() does not remove all relevant piece of the output #6280

Closed nathaneastwood closed 2 years ago

nathaneastwood commented 2 years ago

Willingness to contribute

Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.

MLflow version

1.27.0

System information

Describe the problem

I cannot load a model which I have previously logged.

Tracking information

No response

Code to reproduce issue

mod <- carrier::crate(
  .fn = function(formula, data) stats::lm(formula = formula, data = data),
  formula, data
)

mlflow::mlflow_log_model(model = mod, artifact_path = "myCoolMod")
# 2022/07/20 14:42:35 INFO mlflow.store.artifact.cli: Logged artifact from local dir C:/.../Rtmp44ZXKk/myCoolMod to artifact_path=myCoolModRoot URI: file://C:/.../mlruns/0/bf154459eeca42a4b4f4e811ce615897/artifacts
# # A tibble: 2 x 3
#   path                is_dir file_size
#   <chr>               <lgl>      <int>
# 1 myCoolMod/MLmodel   FALSE        176
# 2 myCoolMod/crate.bin FALSE        309

mlflow::mlflow_load_model(
  model_uri = "file://C:/Users/EHYLM/work/mlruns/0/bf154459eeca42a4b4f4e811ce615897/artifacts/myCoolMod",
  flavor = "crate"
)
# Error in file(file, "rt", encoding = fileEncoding) : 
#   cannot open the connection
# In addition: Warning message:
# In file(file, "rt", encoding = fileEncoding) :
#   cannot open file 'C:/.../mlruns/0/bf154459eeca42a4b4f4e811ce615897//MLmodel': Invalid argument

I took a look under the hood and there are a couple of things going on here. Firstly, mlflow:::mlflow_download_artifacts_from_uri() is called and returns

model_path
# [1] "C:\\...\\mlruns\\0\\bf154459eeca42a4b4f4e811ce615897\\artifacts\\myCoolMod\r"

As you can see, there has been a \r appended here. If you step into this function, there is a \n too but this is removed here. I think this is a bug.

If we carry on stepping through the mlflow::mlflow_load_model(), the following is called (here)

fs::path(model_path, "MLmodel")
# /MLmodel/.../work/mlruns/0/bf154459eeca42a4b4f4e811ce615897/artifacts/myCoolMod

As you can see, rather than appending the MLmodel, it has instead prepended it. This may well be a bug within the fs package.

Other info / logs

No response

What component(s) does this bug affect?

What interface(s) does this bug affect?

What language(s) does this bug affect?

What integration(s) does this bug affect?

nathaneastwood commented 2 years ago

Looking at this some more, the issue indeed seems to come from the \r.

model_path2 <- "C:\\...\\mlruns\\0\\bf154459eeca42a4b4f4e811ce615897\\artifacts"
fs::path(model_path2, "MLmodel")
# C:/.../mlruns/0/bf154459eeca42a4b4f4e811ce615897/artifacts/MLmodel