Add support for MLFlow

mgoeminne commented 4 years ago

To improve the usability of FADI when it is deployed for Machine Learning / Data Science projects, support for MLFlow should be added.

MLFlow is a relatively recent, open source project from Databricks for storing and managing metrics that relate to ML models. Due to its loose coupling, this tool can be used with a large set of ML libraries.

From the user's point of view, MLFlow is essentially a REST API for submitting quality metrics, plus a Web application for managing them.
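To make the REST side concrete, here is a minimal sketch of how a single metric could be submitted. It only builds the HTTP request and does not send it. The server address, run ID and metric name are hypothetical; the `api/2.0/mlflow/runs/log-metric` endpoint and its fields are taken from the MLflow REST API documentation and should be checked against the deployed version.

```python
import json
import time
import urllib.request


def build_log_metric_request(tracking_uri, run_id, key, value, step=0):
    """Build (but do not send) a POST request for MLflow's log-metric endpoint."""
    payload = {
        "run_id": run_id,
        "key": key,
        "value": value,
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
        "step": step,
    }
    return urllib.request.Request(
        url=tracking_uri.rstrip("/") + "/api/2.0/mlflow/runs/log-metric",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server address and run ID, for illustration only.
request = build_log_metric_request("http://mlflow.example:5000", "abc123", "rmse", 0.42)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

In practice most users would go through the MLflow Python client rather than raw HTTP, but the request above is roughly what ends up on the wire.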

Is your feature request related to a problem? Please describe.
No, it is a suggestion for an extension that improves the functional coverage of FADI instances.

Describe the solution you'd like
Helm charts should be added to FADI so that an instance of MLFlow can be deployed and used.

Describe alternatives you've considered
KubeFlow looks like a "natural" alternative, but it only focuses on the TensorFlow framework, which makes it more specific.

Additional context
N/A

mgoeminne commented 4 years ago

@Maher-badri

Sellto commented 4 years ago

Back on the integration of MLFlow in FADI.

Tests were carried out with an existing MLFlow Helm chart, but it lacked some configuration options that are essential for integration into FADI. We have improved the chart to meet our requirements (it is now available in the CETIC Helm repository).

The use case we deployed combines MLFlow with the following FADI modules: a PostgreSQL database (storing metrics), MinIO (storing artifacts), JupyterHub (launching experiments) and OpenLDAP (user management). Several observations can be made:

The integration is functional: it was possible to carry out a simple experiment in JupyterHub and to retrieve the metrics and the artifacts.

However, some drawbacks deserve to be raised:

MLFlow does not have user management. Can we imagine that this is a security breach?
It is essential to define the S3 credentials in JupyterHub in the form of three environment variables (AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID, MLFLOW_S3_ENDPOINT_URL). A question therefore arises: which service communicates with MinIO? The consumer? The server?
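As a minimal sketch of the second point, a notebook could check that the three variables are present before starting an experiment. The helper name is hypothetical and the check only looks at the environment; it does not contact MinIO.

```python
import os

# The three variables named above.
REQUIRED_S3_VARIABLES = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "MLFLOW_S3_ENDPOINT_URL",
)


def missing_s3_variables(environ=None):
    """Return the names of the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_S3_VARIABLES if not env.get(name)]


missing = missing_s3_variables()
if missing:
    print("Missing S3 settings:", ", ".join(missing))
else:
    print("S3 settings found, endpoint:", os.environ["MLFLOW_S3_ENDPOINT_URL"])
```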

Given these findings, should MLFlow be included in FADI, or can it simply be used via the MLFlow Helm chart we created?

banzo commented 4 years ago

> MLFlow does not have user management. Can we imagine that this is a security breach?

We could rely on the git/S3 credentials for this I guess.

> should MLFlow be included in FADI, or can it simply be used via the MLFlow Helm chart we created?

After discussion with @mgoeminne, I would say that it makes sense: the need is confirmed. The next steps would be to integrate the chart (disabled by default) into the FADI chart and to provide a user guide.

I am thinking we might want to adopt some kind of "incubator" approach where we have several tiers of support for FADI services.

> It is essential to define the S3 credentials in JupyterHub in the form of three environment variables (AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID, MLFLOW_S3_ENDPOINT_URL). A question therefore arises: which service communicates with MinIO? The consumer? The server?

I'd say both should be possible. Which one would make more sense or be the simplest to implement? NB: https://kubernetes.io/docs/concepts/configuration/secret/
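For the consumer-side option, a rough sketch of how the hub could hand the credentials to notebooks instead of having users type them in. It assumes the Kubernetes Secret is already exposed to the hub process as environment variables; the helper name is hypothetical, while `c.Spawner.environment` is the JupyterHub setting for extra variables passed to single-user servers.

```python
import os

S3_VARIABLES = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "MLFLOW_S3_ENDPOINT_URL",
)


def s3_environment_for_notebooks(hub_environ=None):
    """Pick the S3 settings out of the hub's environment, skipping unset ones."""
    source = os.environ if hub_environ is None else hub_environ
    return {name: source[name] for name in S3_VARIABLES if name in source}


# In jupyterhub_config.py, where `c` is the config object JupyterHub provides:
# c.Spawner.environment = s3_environment_for_notebooks()
```

This keeps the secret values out of notebooks and user guides, at the cost of every notebook user sharing the same MinIO credentials.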

Sellto commented 4 years ago

MLFlow is now available in FADI.

We are working on a practical use case that uses MLFlow. The result will be documentation that FADI users can follow to make proper use of this new ML tool.

banzo commented 4 years ago

Reopening this until we have some basic doc and ideally a full example.

cetic / fadi

Add support for MLFlow #90