bitnami / charts

Bitnami Helm Charts
https://bitnami.com
Other
8.81k stars 9.09k forks source link

MLFlow allow using other supported database backends as external database #25964

Closed frittentheke closed 3 months ago

frittentheke commented 3 months ago

Name and Version

bitnami/mlflow 1.0.6

What is the problem this feature will solve?

mlflow has support for different database types as backend stores (https://mlflow.org/docs/latest/tracking/backend-stores.html?highlight=mssql#supported-store-types). Currently the chart does not allow to configure those as the database dialect is statically set to postgresql.

What is the feature you are proposing to solve the problem?

I propose to add a variable to allow using the other supported database types as external database.

What alternatives have you considered?

No response

frittentheke commented 3 months ago

@carrodher I understand that my PR https://github.com/bitnami/charts/pull/25965 only exposes the configuration options regarding the used database engine and that container image does also need additional python libraries to support those other database engines.

I was about to push another PR to also extend the container image (https://github.com/bitnami/containers/blob/main/bitnami/mlflow/2/debian-12/Dockerfile) with e.g.

pymssql
pymysql
pyodbc
...

like mlFlow does intend to do for their own container image (https://github.com/mlflow/mlflow/pull/11669/files).

But apparently the Python libraries are already part of the COMPONENTS and packaged into the tar.gz files fetched from Stacksmith (https://github.com/bitnami/containers/blob/913d85fcaf8b8e7a7a719a6a0bb5bd2aff179928/bitnami/mlflow/2/debian-12/Dockerfile#L32C8-L32C14)

Is there some sort of build script or requirements script that I can extend to have the required Python libraries added or is that something you would need to do?

carrodher commented 3 months ago

Thank you for opening this issue and submitting the associated Pull Request. Our team will review and provide feedback. Once the PR is merged, the issue will automatically close.

Regarding the python version, yes, it is something handled internally by our automated test & release pipeline. If changing that version is required, feel free to comment it in the container's PR so the reviewer can test that change internally.

Your contribution is greatly appreciated!

act-mreeves commented 3 months ago

@frittentheke - we discuss this in https://github.com/bitnami/charts/issues/22992 though specifically that is requesting mysql access.

Edit: I am thinking possibly https://github.com/mlflow/mlflow/pull/11669/files is a result of my request in that issue.

frittentheke commented 3 months ago

Edit: I am thinking possibly https://github.com/mlflow/mlflow/pull/11669/files is a result of my request in that issue.

Maybe. But currently Bitnami is using their own image, thus the comment from @javsalgar (https://github.com/bitnami/charts/issues/22992#issuecomment-2009007629) about adding the required modules to their image to even support other DB engines / dialects.

act-mreeves commented 3 months ago

@frittentheke - So does that mean this won't work til this is merged: https://github.com/mlflow/mlflow/pull/11669/files ? Or perhaps it will work if you use your own custom mlflow image that has the necessary libraries installed?

frittentheke commented 3 months ago

@frittentheke - So does that mean this won't work til this is merged: https://github.com/mlflow/mlflow/pull/11669/files ?

This is a PR for the official MLflow container images. Bitnami is not using them, see https://github.com/bitnami/charts/issues/25964#issuecomment-2144674162.

Or perhaps it will work if you use your own custom mlflow image that has the necessary libraries installed?

Yes, custom images will certainly work. But since Bitnami build their own containers (https://github.com/bitnami/containers/tree/main/bitnami/mlflow) it would be nice to have them support all database engines / dialects that MLflow (and now the Helm chart) supports.

@act-mreeves could you kindly tell me what creates the packages on Stacksmith which are then used within the Dockerfile ...

How could I create a PR to update the Stack (https://bitnami.com/stack/mlflow) in order to add the required Python libs?