Minyus / pipelinex

PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
https://pipelinex.readthedocs.io/

How to specify an mlflow run_id #12

Closed edwardcjohnson closed 3 years ago

edwardcjohnson commented 3 years ago

Thank you for creating pipelinex. I appreciate how easy it is to swap out the standard kedro datasets with pipelinex's "MLflowDataSet".

It looks like it is possible to pass a run_id for MLflow in mlflow_config.py, as shown in the example mlflow_config.py.

Is it possible to conveniently set the MLflow run_id either as a parameter in parameters.yml or when I launch the run with kedro run? I am trying to avoid creating an mlflow_config.py just for this one run_id parameter. Thank you!

Minyus commented 3 years ago

Thank you for your feedback!

I exposed the run_id argument of mlflow.start_run and have just released PipelineX 0.7.0. You can now specify run_id in pipelinex.MLflowBasicLoggerHook.

Currently, setting run_id in parameters.yml or through kedro run is not supported. (Configuration through parameters.yml was removed to support kedro 0.17.x, which changed the architecture design.)

It is in the backlog, but it is not a high priority. Contributions are welcome.
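
As a possible interim workaround, here is a minimal sketch of picking the run_id up at launch time and passing it to pipelinex.MLflowBasicLoggerHook. It assumes the ID is supplied through an environment variable; the variable name PIPELINEX_RUN_ID and the helpers resolve_run_id, build_hook_kwargs, and make_hook are hypothetical names made up for illustration and are not part of PipelineX. Only the run_id keyword comes from this thread; any other hook arguments would need to be checked against the PipelineX documentation.

```python
import os
from typing import Optional


def resolve_run_id(env_var: str = "PIPELINEX_RUN_ID") -> Optional[str]:
    """Return the run ID from the environment, or None if unset or blank.

    The environment variable name is a hypothetical choice for this sketch.
    """
    value = os.environ.get(env_var, "").strip()
    return value or None


def build_hook_kwargs() -> dict:
    """Collect keyword arguments for the hook.

    Only run_id is included, since that is the argument this thread confirms.
    It is omitted entirely when no ID was supplied, so the hook keeps its
    default behaviour.
    """
    kwargs = {}
    run_id = resolve_run_id()
    if run_id is not None:
        kwargs["run_id"] = run_id
    return kwargs


def make_hook():
    """Instantiate the hook (requires PipelineX 0.7.0 or later)."""
    # Imported lazily so the helpers above work without PipelineX installed.
    from pipelinex import MLflowBasicLoggerHook

    return MLflowBasicLoggerHook(**build_hook_kwargs())
```

With something like this in the place where the project registers its hooks, a run could be launched as PIPELINEX_RUN_ID=<existing run id> kedro run, and the variable could be left unset to start a fresh MLflow run.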