aws-samples / eks-kubeflow-workshop

Kubeflow workshop on EKS. Mainly focuses on AWS integration examples. See the Kubeflow website, http://kubeflow.org, for other examples.
Apache License 2.0

MLflow Server EKS Service API 500 #55

JoshuaMVitullo closed this issue 4 years ago

JoshuaMVitullo commented 4 years ago

Hello,

I'm following the steps in the MLflow tracking notebook. I was able to deploy MLflow and port-forward to view the UI. The problem I have now is logging from a Jupyter notebook that runs in another pod on my EKS cluster.

I'm trying to execute this code in my notebook:

import mlflow

print("Setting Tracking Server")
tracking_uri = "http://mlflow-tracking-server.default.svc.cluster.local:5000"

mlflow.set_tracking_uri(tracking_uri)

test_str = 'Hello, World!'
print(test_str)
mlflow.log_param('test_str', test_str)

print("Logging Artifact")
mlflow.log_artifact('/home/user/mlflow-example-artifact.png')

print("DONE")

But I keep getting the following error:

2020/04/21 19:42:53 ERROR mlflow.utils.rest_utils: API request to http://mlflow-tracking-server:5000/api/2.0/mlflow/runs/log-parameter failed with code 500 != 200, retrying up to 0 more times. API response body: {"error_code": "RESOURCE_DOES_NOT_EXIST", "message": "Run 'e633f09fa75c4d4d915b994244c8e2c8' not found"}

Is there anything else I could have missed?

Jeffwan commented 4 years ago

@TheVillageFool Can you check your pod status? Is the MLflow pod running?
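
Besides checking the pod status, it may help to confirm from inside the notebook pod that the tracking service accepts connections at all. The following is a minimal sketch using only the Python standard library; the helper names `parse_host_port` and `can_connect` are made up for illustration, and the URI is the one from the code above. It only tests TCP reachability, so a successful connection does not prove that the MLflow API itself is healthy.

```python
import socket
from urllib.parse import urlparse


def parse_host_port(tracking_uri):
    """Split a tracking URI into (host, port), defaulting the port by scheme."""
    parsed = urlparse(tracking_uri)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    uri = "http://mlflow-tracking-server.default.svc.cluster.local:5000"
    host, port = parse_host_port(uri)
    status = "reachable" if can_connect(host, port) else "unreachable"
    print(host, port, status)
```

If this reports "unreachable", the service name, namespace, or port in the URI is a likely place to look before debugging the logging calls themselves.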

JoshuaMVitullo commented 4 years ago

Yes, the pod was running. I tore down the JupyterHub pod and the MLflow pod, recreated them, and now it works. But another issue has come up: when I use the MLflow UI to view an entry, I get an "Oops! Something went wrong." error page whenever I select anything in the UI.
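
For anyone who hits the earlier "Run ... not found" error: one possible cause, not confirmed in this thread, is that the notebook kernel still holds an active run that was created against a different tracking store, so later log calls send a run ID the current server does not know. A minimal sketch that avoids relying on an implicit run is shown below. The helper name `log_hello` is hypothetical, and `mlflow` is imported inside the function only so the helper can be defined where MLflow is not installed.

```python
def log_hello(tracking_uri, artifact_path=None):
    """Log one parameter (and optionally one artifact) inside an explicit run.

    Returns the ID of the run that was created.
    """
    import mlflow  # assumes the mlflow package is installed in the notebook image

    # Point the client at the server before any run is created.
    mlflow.set_tracking_uri(tracking_uri)

    # End any run left over in this kernel from an earlier session.
    if mlflow.active_run() is not None:
        mlflow.end_run()

    # Create the run explicitly; it is ended when the block exits.
    with mlflow.start_run() as run:
        mlflow.log_param("test_str", "Hello, World!")
        if artifact_path is not None:
            mlflow.log_artifact(artifact_path)
        return run.info.run_id


if __name__ == "__main__":
    run_id = log_hello(
        "http://mlflow-tracking-server.default.svc.cluster.local:5000",
        artifact_path="/home/user/mlflow-example-artifact.png",
    )
    print("Logged to run", run_id)
```

Restarting the kernel, as happened here when the pods were recreated, would also clear a stale active run.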