datarevenue-berlin / OpenMLOps

MIT License

Question: Why did you use mlflow? #79

Open roche-MH opened 3 years ago

roche-MH commented 3 years ago

As far as I know, Airflow offers more flexibility than MLflow and supports a wider range of functions. However, OpenMLOps uses MLflow, so I would like to know what MLflow is used for in this project.

pipatth commented 3 years ago

Thank you for your question.

MLflow is our model registry, so it is responsible for storing and versioning ML models. We also chose it because it has very good support for all major ML frameworks (e.g. TensorFlow, scikit-learn).
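To make "storing and versioning" concrete, here is a toy in-memory sketch of what a model registry does. It is not MLflow's API: `ToyModelRegistry`, `register`, `load`, and the model name `churn-classifier` are all hypothetical and only illustrate the idea that registering a model under an existing name creates a new version rather than overwriting the old one.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyModelRegistry:
    """Hypothetical in-memory registry for illustration; NOT MLflow's API."""

    _store: Dict[str, List[Any]] = field(default_factory=dict)

    def register(self, name: str, model: Any) -> int:
        """Store a model under a name and return its new version number."""
        versions = self._store.setdefault(name, [])
        versions.append(model)
        return len(versions)  # version numbers start at 1

    def load(self, name: str, version: int) -> Any:
        """Fetch one specific version of a named model."""
        return self._store[name][version - 1]


registry = ToyModelRegistry()

# Registering twice under the same (hypothetical) name keeps both versions.
v1 = registry.register("churn-classifier", {"weights": [0.1, 0.2]})
v2 = registry.register("churn-classifier", {"weights": [0.3, 0.4]})

# An older version stays retrievable after a newer one is registered.
old_model = registry.load("churn-classifier", v1)
```

A real registry would persist the models and their metadata instead of holding them in memory; the sketch only shows the versioning behaviour.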

Airflow, on the other hand, is a workflow orchestrator: it takes care of scheduling the tasks to be run (e.g. data preprocessing, model training). I think Airflow is more comparable to Prefect, which is what we chose for that role. Airflow is great, but we love Prefect because it works out of the box with Dask, our distributed computing tool (see https://examples.dask.org/applications/prefect-etl.html).
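As a rough illustration of the orchestrator role, the sketch below runs two tasks in dependency order using only the Python standard library. It is neither Prefect's nor Airflow's API: `run_flow`, the task functions, and the sample data are hypothetical, and a real orchestrator adds scheduling, retries, and distributed execution on top of this.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, Set


def preprocess_data(results: Dict[str, Any]) -> list:
    """Hypothetical preprocessing step: scale some sample values."""
    return [x * 2 for x in [1, 2, 3]]


def train_model(results: Dict[str, Any]) -> int:
    """Hypothetical training step: consumes the preprocessing output."""
    return sum(results["preprocess_data"])


# Each task name maps to the set of tasks that must finish before it.
dependencies: Dict[str, Set[str]] = {
    "preprocess_data": set(),
    "train_model": {"preprocess_data"},
}
tasks: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "preprocess_data": preprocess_data,
    "train_model": train_model,
}


def run_flow(tasks, dependencies) -> Dict[str, Any]:
    """Run every task after its dependencies and collect the results."""
    results: Dict[str, Any] = {}
    for name in TopologicalSorter(dependencies).static_order():
        results[name] = tasks[name](results)
    return results


results = run_flow(tasks, dependencies)
```

The split of responsibilities is the point here: the orchestrator decides what runs and when, while the model registry stores the models that a training task produces.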

bernardolk commented 3 years ago

@roche-MH we also have an article that compares similar technologies against each other; you might find it interesting: https://www.datarevenue.com/en-blog/airflow-vs-luigi-vs-argo-vs-mlflow-vs-kubeflow