Netflix / metaflow-service

:rocket: Metadata tracking and UI service for Metaflow!
http://www.metaflow.org
Apache License 2.0

Metadata request (/flows/<flowname>) failed (code 502) #434

Open michellewehr opened 3 months ago

michellewehr commented 3 months ago

I'm getting the following error:

 Metaflow 2.2.7 executing AnalysisFlow for user:ssm-user
 Validating your flow...
     The graph looks good!
 Creating local datastore in current directory 
 Bootstrapping conda environment...(this could take a few minutes)
     Metaflow service error:
     Metadata request (/flows/AnalysisFlow) failed (code 502): <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> 
     <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
         <head>
             <title>The page is temporarily unavailable</title>
             ....

Does this indicate a problem with METAFLOW_SERVICE_URL, or is something else returning the 502? I get the same error when I hit my own APIs that use the Metaflow API client to fetch artifacts.

michellewehr commented 2 months ago

We found the cause! The default parameter group for Postgres 15 sets force_ssl to true, whereas the default for our Postgres 11 did not. We updated our metaflow to version 2.4.12 so that we can pass a cert to the Docker startup / DB connection.

Looking at the metadata service's environment variables:

ssl_mode = os.environ.get("MF_METADATA_DB_SSL_MODE")
ssl_cert_path = os.environ.get("MF_METADATA_DB_SSL_CERT_PATH")
ssl_key_path = os.environ.get("MF_METADATA_DB_SSL_KEY_PATH")
ssl_root_cert_path = os.environ.get("MF_METADATA_DB_SSL_ROOT_CERT")
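
To make the role of these four variables concrete, here is a minimal sketch of how they could map onto libpq's SSL connection keywords (sslmode, sslcert, sslkey, sslrootcert). The helper name ssl_params_from_env is hypothetical, and the mapping is an assumption about how the service uses the values, not a copy of its code.

```python
import os

# Assumed mapping from the service's environment variables to the libpq
# connection keywords they most plausibly feed. This is an illustration,
# not the service's actual implementation.
ENV_TO_LIBPQ = {
    "MF_METADATA_DB_SSL_MODE": "sslmode",
    "MF_METADATA_DB_SSL_CERT_PATH": "sslcert",
    "MF_METADATA_DB_SSL_KEY_PATH": "sslkey",
    "MF_METADATA_DB_SSL_ROOT_CERT": "sslrootcert",
}


def ssl_params_from_env(environ=None):
    """Return the libpq SSL keywords whose variables are set and non-empty."""
    environ = os.environ if environ is None else environ
    return {
        keyword: environ[name]
        for name, keyword in ENV_TO_LIBPQ.items()
        if environ.get(name)
    }
```

Under this assumption, any variable that is unset is simply left out, so the database driver falls back to its own default for that keyword.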

Do we need to pass ssl_root_cert_path (MF_METADATA_DB_SSL_ROOT_CERT) in our docker run command?

And if so, where is the path resolved from, and what should it look like? We run from a repo that lives on our EC2 instance, yet when I pass ssl_root_cert_path with a value of /home/ec2-user/repo and path where file exists/ it can't find the path...
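
One thing worth checking here: a process started with docker run sees the container's filesystem, so a host path such as one under /home/ec2-user is only visible if it is mounted into the container (for example with docker run's -v flag) and the variable points at the container-side path. The sketch below, using a hypothetical helper name missing_ssl_files, could be run inside the container to report which configured paths do not resolve there.

```python
import os

# Environment variables from the service that are expected to hold file paths.
SSL_PATH_VARS = (
    "MF_METADATA_DB_SSL_CERT_PATH",
    "MF_METADATA_DB_SSL_KEY_PATH",
    "MF_METADATA_DB_SSL_ROOT_CERT",
)


def missing_ssl_files(environ=None):
    """Map each set variable to its path when no file exists at that path."""
    environ = os.environ if environ is None else environ
    missing = {}
    for name in SSL_PATH_VARS:
        path = environ.get(name)
        # Unset variables are skipped; only configured paths are checked.
        if path and not os.path.isfile(path):
            missing[name] = path
    return missing


if __name__ == "__main__":
    for name, path in missing_ssl_files().items():
        print(f"{name}={path} is not a file in this filesystem")
```

If the script reports the root cert path as missing when run inside the container but the file exists on the host, the path is being interpreted on the wrong side of the container boundary.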