sertansenturk / cookiecutter-ds-docker

A Docker-based Data Science cookiecutter (for myself)
https://cookiecutter-ds-docker.readthedocs.io/
GNU Affero General Public License v3.0

MLFlow won't work, possibly due to some obsolete modules? #60

Open nebelgrau77 opened 2 years ago

nebelgrau77 commented 2 years ago

Hi,

I'm trying to get it all installed, but MLFlow won't work. The container seems to get started but then it disappears, and restarting it won't help. I looked at the output of the make command, and here it is for MLFlow:

mlflow_1    | Traceback (most recent call last):
mlflow_1    |   File "/usr/local/bin/mlflow", line 5, in <module>
mlflow_1    |     from mlflow.cli import cli
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/mlflow/__init__.py", line 31, in <module>
mlflow_1    |     import mlflow.tracking._model_registry.fluent
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/mlflow/tracking/__init__.py", line 8, in <module>
mlflow_1    |     from mlflow.tracking.client import MlflowClient
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/mlflow/tracking/client.py", line 8, in <module>
mlflow_1    |     from mlflow.entities import ViewType
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/mlflow/entities/__init__.py", line 6, in <module>
mlflow_1    |     from mlflow.entities.experiment import Experiment
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/mlflow/entities/experiment.py", line 2, in <module>
mlflow_1    |     from mlflow.entities.experiment_tag import ExperimentTag
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/mlflow/entities/experiment_tag.py", line 2, in <module>
mlflow_1    |     from mlflow.protos.service_pb2 import ExperimentTag as ProtoExperimentTag
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/mlflow/protos/service_pb2.py", line 18, in <module>
mlflow_1    |     from .scalapb import scalapb_pb2 as scalapb_dot_scalapb__pb2
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/mlflow/protos/scalapb/scalapb_pb2.py", line 35, in <module>
mlflow_1    |     serialized_options=None, file=DESCRIPTOR)
mlflow_1    |   File "/usr/local/lib/python3.7/site-packages/google/protobuf/descriptor.py", line 560, in __new__
mlflow_1    |     _message.Message._CheckCalledFromGeneratedFile()
mlflow_1    | TypeError: Descriptors cannot not be created directly.
mlflow_1    | If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
mlflow_1    | If you cannot immediately regenerate your protos, some other possible workarounds are:
mlflow_1    |  1. Downgrade the protobuf package to 3.20.x or lower.
mlflow_1    |  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
mlflow_1    | 
mlflow_1    | More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Can it be updated in the Makefile or Dockerfile? I can't tell if it's related to MLFlow itself, or maybe Python?
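For reference, workaround 2 from the error message can be sketched in Python as below. This is only an illustration: the helper name `force_pure_python_protobuf` is hypothetical, and it assumes MLflow is imported from a script. For the `mlflow` command inside the container, the same variable would presumably have to be set in the container's environment instead.

```python
import os


def force_pure_python_protobuf() -> None:
    """Select protobuf's pure-Python parser (workaround 2 in the error message).

    Hypothetical helper, not part of mlflow or the template. The variable is
    read when google.protobuf is first imported, so call this before
    importing mlflow. Parsing will be slower, as the message warns.
    """
    os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"


force_pure_python_protobuf()
# import mlflow  # would go here, after the variable is set
```

Setting the variable only hides the mismatch between the generated `_pb2.py` files and the installed protobuf, so it is a stopgap rather than a fix.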

sertansenturk commented 2 years ago

Hi there,

Thank you for pointing the issue out. You are spot on about the libraries getting a bit outdated. This particular one seems to be rooted in the implicit protobuf dependency (if I remember correctly) of mlflow.
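A minimal sketch for checking whether the installed protobuf falls in the range the error message suggests (3.20.x or lower). The function names are hypothetical and not part of mlflow or this template. `importlib.metadata` needs Python 3.8+, while the container in the traceback runs Python 3.7, so this assumes a newer interpreter.

```python
import re
from importlib import metadata  # Python 3.8+ (assumption; the container uses 3.7)


def is_protobuf_compatible(version: str) -> bool:
    """Return True if `version` is 3.20.x or lower (workaround 1 in the error)."""
    # Compare only the leading major/minor numbers, e.g. "3.20.1" -> (3, 20).
    major_minor = tuple(int(n) for n in re.findall(r"\d+", version)[:2])
    return major_minor <= (3, 20)


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if it is absent."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_protobuf_version()
    if found is None:
        print("protobuf is not installed")
    elif is_protobuf_compatible(found):
        print(f"protobuf {found} is within the range the error message suggests")
    else:
        print(f"protobuf {found} is too new; a pin such as 'protobuf<3.21' may help")
```

If the check reports a version that is too new, adding a pin such as `protobuf<3.21` next to the mlflow requirement is one possible way to apply workaround 1. Where that requirement lives in this template is not stated in the thread.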

I am a bit cramped with personal stuff right now, but I'd be very happy to review if you open a PR with a solution.

Otherwise, I am planning to update the repo next month with a lighter-weight setup, though I cannot promise an ETA on that.

nebelgrau77 commented 2 years ago

Hello,

I would love to try, but I don't know where to start. I found out about it because I'm trying to learn from the book "Machine Learning Engineering with MLflow", which has examples that at some point stem from your template, I believe. I've been trying to do things from scratch, and stumbled upon this problem. If you could point me to the right file, I could try :)

I don't understand what should be updated. Is it MLflow (your minimal version is 1.8.*, the current one is 1.26), or one of the Jupyter components?