Galileo-Galilei / kedro-mlflow-tutorial

A tutorial on how to use the kedro-mlflow plugin (https://github.com/Galileo-Galilei/kedro-mlflow) to synchronize training and inference and serve a kedro pipeline

No default MLFlow run to serve #3

Closed pypeaday closed 3 years ago

pypeaday commented 3 years ago

In globals.yml there is a default MLFlow run id to serve, run_id_to_serve: 59adbbd705a345b680639eb879dd04b7. However, this run doesn't exist, so kedro-mlflow-tutorial run --pipeline=etl_instances breaks with the following error:

kedro.io.core.DataSetError: Failed while loading data from data set MlflowModelLoggerDataSet(artifact_path=kedro_mlflow_tutorial, flavor=mlflow.pyfunc, load_args={}, pyfunc_workflow=python_model, run_id=59adbbd705a345b680639eb879dd04b7, save_args={}).
Run '59adbbd705a345b680639eb879dd04b7' not found

Galileo-Galilei commented 3 years ago

Hi @nicpayne713, thanks for reporting the issue.

This sounds very strange, because when you run the etl_instances pipeline, the pipeline_inference_model entry of catalog.yml should never be loaded, since it is not used in this pipeline. (This is the exact purpose of this tutorial, by the way: you must run the etl_instances pipeline first, then the training pipeline, and finally you can reuse the inference pipeline (with the trained model) inside the user_app pipeline.) It does not make sense that the etl_instances pipeline should be aware of whether the run specified for user_app exists or not.
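To make that expectation concrete, here is a minimal sketch in plain Python. It is not kedro's implementation: `LazyCatalog`, `run_pipeline` and the `instances` dataset name are hypothetical and only illustrate the idea that a dataset is loaded only when a pipeline actually requests it.

```python
class MissingRunError(Exception):
    """Stand-in for the 'Run ... not found' failure (hypothetical class)."""


class LazyCatalog:
    """Toy catalog: a loader is called only when its dataset is requested."""

    def __init__(self, loaders):
        self._loaders = loaders
        self.loaded = []  # names of the datasets that were actually loaded

    def load(self, name):
        self.loaded.append(name)
        return self._loaders[name]()


def load_model_from_missing_run():
    # Mimics a model dataset that points at a run id which does not exist.
    raise MissingRunError("Run '59adbbd705a345b680639eb879dd04b7' not found")


catalog = LazyCatalog({
    "instances": lambda: [1, 2, 3],  # hypothetical input dataset
    "pipeline_inference_model": load_model_from_missing_run,
})


def run_pipeline(catalog, inputs):
    # Load only the datasets this pipeline declares as inputs.
    return {name: catalog.load(name) for name in inputs}


# A pipeline that never reads the model is unaffected by the missing run.
result = run_pipeline(catalog, ["instances"])
```

Only a pipeline that lists `pipeline_inference_model` among its inputs would hit the missing run in this sketch.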

I notice that you use the kedro-mlflow-tutorial run --pipeline=etl_instances command. I guess that you have installed the src/ folder as a package (say with pip install -e src). Could you tell me if everything works when you only use kedro run --pipeline=etl_instances (i.e. the default kedro command instead of the main.py of the installed package)? I will try to reproduce the issue on my own.

If I remove this parameter to make it empty, you will likely still get a Run '' not found error. If I add my own mlflow run inside the project, it does not make sense with regard to the tutorial, where the goal is precisely to make people change this parameter manually. It should really not raise an error when running a pipeline that does not read the aforementioned dataset.

Galileo-Galilei commented 3 years ago

I've figured out what's going on: when you run the kedro-mlflow-tutorial run --pipeline=etl_instances command, you run the "run_package" entry point of the project, which entirely ignores the arguments you pass to it. As a result, it behaves as if you ran the kedro-mlflow-tutorial run command, which is equivalent to the kedro-mlflow-tutorial run --pipeline=__default__ command. Since this pipeline launches the user_app pipeline, you got the error.
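The behaviour described above can be sketched in plain Python. This is an illustration only, not kedro's actual code: `run_package_ignoring_args`, `main_forwarding_args` and the `PIPELINES` registry are hypothetical names, and the content of `__default__` is an assumption (the only thing known here is that it launches user_app).

```python
import argparse

# Hypothetical registry; "__default__" is assumed to launch user_app.
PIPELINES = {
    "etl_instances": ["etl_instances"],
    "training": ["training"],
    "user_app": ["user_app"],
    "__default__": ["user_app"],
}


def run(pipeline_name="__default__"):
    """Return the pipelines that would be executed for the given name."""
    return PIPELINES[pipeline_name]


def run_package_ignoring_args(argv):
    # The arguments are never inspected, so the default pipeline always runs.
    return run()


def main_forwarding_args(argv):
    # The arguments are parsed and the requested pipeline is forwarded.
    parser = argparse.ArgumentParser(prog="kedro-mlflow-tutorial run")
    parser.add_argument("--pipeline", default="__default__")
    args = parser.parse_args(argv)
    return run(args.pipeline)


cli_args = ["--pipeline=etl_instances"]
ignored = run_package_ignoring_args(cli_args)  # default pipeline is run
forwarded = main_forwarding_args(cli_args)     # requested pipeline is run
```

With the first entry point, asking for etl_instances still ends up running the default pipeline, which is why the missing run id surfaces.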

It is weird that running the project as a package ignores the CLI arguments. This looks like a bug, and it may have been fixed in more recent kedro versions; I need to check (and update the tutorial to the most recent versions).

Running kedro run --pipeline=etl_instances behaves normally, and you get the expected result. You should modify the run_id manually so that it points to the run you've trained.

pypeaday commented 3 years ago

I apologize for going dark after posting the issue. The command kedro run --pipeline=etl_instances did not work for me initially regardless of the directory I was in... however today it did. I have had similar issues with my terminal before (I've been transitioning to using vim and being more active in the terminal) as my setup is constantly changing and part of that includes how I use conda... I honestly can't say what I did differently today vs 2 weeks ago but the default command did seem to work ok. I'm finally back on working with mlflow in a project at work (the reason I was in here in the first place) and so if I come across anything else I'll open a new issue!

Thank you for being prompt! (and again my apologies for not returning the favor) Cheers

Galileo-Galilei commented 3 years ago

Just a follow-up: theoretically, it should have worked with the kedro-mlflow-tutorial run --pipeline=etl_instances command. However, I've figured out it was a bug in old versions of kedro that they apparently fixed very recently: the CLI arguments were not used and the default pipeline was always run (no matter which pipeline you passed through the CLI), as described above.

This should be fixed in kedro>=0.17.4 with the introduction of a __main__.py file instead of run.py, but I haven't tried it.