canonical / mlflow-operator

MLFlow Operators
Apache License 2.0

Error when creating the experiment with artifact_location - bucket #25

Open Barteus opened 2 years ago

Barteus commented 2 years ago

When creating an experiment with the Python mlflow library and setting artifact_location to the bucket itself (not a folder in the bucket), the request fails with a "too many 500 error responses" error. Code:

experiment_id = mlflow.create_experiment(name="Wine Experiments", artifact_location="s3://my-bucket")

Response:

MlflowException: API request to http://mlflow-server.kubeflow.svc.cluster.local:5000/api/2.0/mlflow/experiments/create failed with exception HTTPConnectionPool(host='mlflow-server.kubeflow.svc.cluster.local', port=5000): Max retries exceeded with url: /api/2.0/mlflow/experiments/create (Caused by ResponseError('too many 500 error responses'))

I could not find the root cause. It could be an upstream issue, a missing configuration, or missing privileges.
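
To make the distinction between "the bucket" and "a folder in the bucket" concrete, here is a minimal sketch that classifies an artifact_location using only the standard library. The helper name `is_bucket_root` and the `s3://my-bucket/wine` path are hypothetical and not part of mlflow. It only describes the shape of the URI and says nothing about why the server returns 500s.

```python
from urllib.parse import urlparse


def is_bucket_root(artifact_location: str) -> bool:
    """Return True when an s3:// URI names a bucket with no key prefix.

    Hypothetical helper for illustration; it is not part of mlflow.
    """
    parsed = urlparse(artifact_location)
    # For "s3://my-bucket" the bucket is the netloc and the path is empty;
    # a trailing slash alone still means "no folder inside the bucket".
    return parsed.scheme == "s3" and parsed.path.strip("/") == ""


# The location from the report above is a bucket root.
print(is_bucket_root("s3://my-bucket"))
# An illustrative folder inside the same bucket is not.
print(is_bucket_root("s3://my-bucket/wine"))
```

A check like this could be used to separate the two cases when trying to reproduce the error.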

Barteus commented 2 years ago

Creating the experiment works perfectly fine without artifact_location; it fails only when artifact_location is set.

ca-scribner commented 1 year ago

Let's try to reproduce this with our current MLFlow charm

ca-scribner commented 1 year ago

This might be a good thing to add to our integration tests, regardless of whether it is currently working
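
A rough sketch of what such a check could look like, written so that the experiment-creation call is passed in rather than imported. Everything here is an assumption for illustration: the function name `check_create_experiment`, the experiment names, and the second location `s3://my-bucket/wine` do not come from the charm's test suite.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical locations: the bucket root from the report, plus a folder
# inside the same bucket for comparison.
ARTIFACT_LOCATIONS = ("s3://my-bucket", "s3://my-bucket/wine")


def check_create_experiment(
    create_experiment: Callable[..., str],
    locations: Iterable[str] = ARTIFACT_LOCATIONS,
) -> Dict[str, Optional[str]]:
    """Try to create one experiment per location and record any failure.

    Maps each location to None on success, or to a short error string.
    """
    results: Dict[str, Optional[str]] = {}
    for index, location in enumerate(locations):
        try:
            create_experiment(
                name=f"artifact-location-check-{index}",
                artifact_location=location,
            )
            results[location] = None
        except Exception as exc:  # record the failure instead of stopping
            results[location] = f"{type(exc).__name__}: {exc}"
    return results
```

Against a deployed server this could be called as `check_create_experiment(mlflow.create_experiment)` after `import mlflow` and pointing the client at the tracking server; a test would then assert that every value in the returned dictionary is None.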

technologic27 commented 1 year ago

Hello, I am having a similar issue. Has it been resolved?