flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0

[BUG] pyflyte run does not work with self-signed certificates #4794

Open kdubovikov opened 9 months ago

kdubovikov commented 9 months ago

Describe the bug

I am using a self-signed certificate to expose the Flyte gRPC endpoint over HTTPS on an EKS cluster. My certificate request configuration is the following:

[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no
[req_distinguished_name]
C = US
ST = WA
L = Seattle
O = Flyte
OU = IT
CN = something.com
emailAddress = kdubovikov@something.com
[v3_req]
keyUsage = keyEncipherment, dataEncipherment, digitalSignature
extendedKeyUsage = serverAuth
subjectAltName = @alt_names
[alt_names]
DNS.1 = something.com

I cannot use the real endpoint in the CN because it is much longer than the 64-character limit.
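For context, the 64-character cap applies to the Common Name field (the `ub-common-name` bound in RFC 5280), while DNS entries under subjectAltName are not subject to it. A minimal sketch of how the `[alt_names]` section could be extended with the real endpoint follows. The helper names and the long hostname are hypothetical and only illustrate the idea; whether this alone resolves the connection error depends on the rest of the setup.

```python
from typing import List

CN_MAX_LENGTH = 64  # upper bound on commonName in RFC 5280 (ub-common-name)


def fits_in_common_name(hostname: str) -> bool:
    """Return True if the hostname is short enough to be used as a CN."""
    return len(hostname) <= CN_MAX_LENGTH


def build_alt_names_section(hostnames: List[str]) -> str:
    """Render an OpenSSL [alt_names] section with one DNS.N entry per hostname."""
    lines = ["[alt_names]"]
    for index, hostname in enumerate(hostnames, start=1):
        lines.append(f"DNS.{index} = {hostname}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical long load balancer hostname, for illustration only.
    long_host = "internal-k8s-flyte-" + "x" * 40 + ".ap-south-1.elb.amazonaws.com"
    print(fits_in_common_name(long_host))
    print(build_alt_names_section(["something.com", long_host]))
```

The rendered section can replace the `[alt_names]` block in the request config above, keeping the short name in the CN and listing the long hostname only as a SAN entry.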

My config is:

admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: dns:///XXXXX
  authType: Pkce
  insecure: false
  insecureSkipVerify: true
logger:
  show-source: true
  level: 6

Whenever I try to run a remote workflow, I get an error:

pyflyte run --remote -p poc -d development ./workflows/poc.py wf
Running Execution on Remote.
Failed with Exception Code: SYSTEM:Unknown
RPC Failed, with Status: StatusCode.UNAVAILABLE
    details: failed to connect to all addresses; last error: UNKNOWN: ipv4:XX.XX.XX.XX:443: Peer name internal-k8s-flyte-XXXX-XXXX.ap-south-1.elb.amazonaws.com is not in peer certificate
    Debug string UNKNOWN:Error received from peer  {grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:XX.XX.XX.XX:443: Peer name internal-k8s-flyte-XXXXX-XXXX.ap-south-1.elb.amazonaws.com is not in peer certificate", grpc_status:14, created_time:"2024-01-30T20:19:10.206646+04:00"}
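The message indicates that the name the client connected to is not among the names in the server certificate. A simplified sketch of that kind of check is below, assuming exact matching plus a single leading wildcard label. It is not gRPC's actual implementation, and `hostname_in_san` is a hypothetical helper name.

```python
from typing import Iterable


def hostname_in_san(hostname: str, san_dns_names: Iterable[str]) -> bool:
    """Simplified hostname check against a certificate's DNS SAN entries.

    Matches case-insensitively on an exact name, or on a pattern of the
    form '*.example.com' where the wildcard covers exactly one label.
    """
    host = hostname.lower().rstrip(".")
    for name in san_dns_names:
        pattern = name.lower().rstrip(".")
        if pattern == host:
            return True
        if pattern.startswith("*."):
            suffix = pattern[1:]  # e.g. ".example.com"
            if host.endswith(suffix):
                prefix = host[: -len(suffix)]
                # The wildcard must stand for one non-empty label only.
                if prefix and "." not in prefix:
                    return True
    return False


if __name__ == "__main__":
    san = ["something.com"]  # the only DNS entry in the request config above
    peer = "internal-k8s-flyte-XXXX-XXXX.ap-south-1.elb.amazonaws.com"
    print(hostname_in_san(peer, san))
    print(hostname_in_san("something.com", san))
```

With only `something.com` listed, the load balancer hostname from the error does not match, which is consistent with the "is not in peer certificate" failure.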

Expected behavior

Pyflyte runs the workflow on a remote cluster

Additional context to reproduce

No response

Screenshots

No response


welcome[bot] commented 9 months ago

Thank you for opening your first issue here! 🛠

dosubot[bot] commented 9 months ago

🤖

The issue you're experiencing is likely due to the Python requests library, which Flyte uses under the hood, not trusting your self-signed certificates by default.

To resolve this issue, you can set the REQUESTS_CA_BUNDLE environment variable to the path of your certificate file. This will tell the requests library to trust your self-signed certificate.

Here's how you can do it:

  1. First, find the path to your certificate file. This will depend on your system configuration.

  2. Then, set the REQUESTS_CA_BUNDLE environment variable to the path of your certificate file. You can do this in your shell before running your Python script, like so:

     export REQUESTS_CA_BUNDLE=/path/to/your/certificate.pem

  3. Now, when you run your Python script, the requests library should trust your self-signed certificate, and pyflyte run should work as expected.

Please note that this solution assumes that you're using a Unix-like system (like Linux or MacOS). If you're using Windows, the process to set environment variables is a bit different.
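As a platform-independent alternative to the shell export, the variable can also be set for a single child process from Python. This is a minimal sketch: `env_with_ca_bundle` is a hypothetical helper, the certificate path is a placeholder, and the variable is only expected to affect code paths that go through the requests library.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_ca_bundle(
    ca_path: str, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with REQUESTS_CA_BUNDLE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["REQUESTS_CA_BUNDLE"] = ca_path
    return env


if __name__ == "__main__":
    # Placeholder path, as in the shell example above.
    env = env_with_ca_bundle("/path/to/your/certificate.pem")
    print(env["REQUESTS_CA_BUNDLE"])
    # The dict can then be passed to a child process, for example:
    # subprocess.run(["pyflyte", "run", "--remote", ...], env=env)
```

Because the helper copies the environment instead of mutating `os.environ`, the setting stays scoped to the process it is passed to.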

Regarding your question about Flyte supporting self-signed certificates for HTTPS GRPC endpoints, yes, it does. This is indicated by the SslOptions struct in the ServerSecurityOptions struct within the ServerConfig struct. The SslOptions struct has two fields: CertificateFile and KeyFile, which are used to specify the paths to the SSL certificate file and key file respectively.

Here is the relevant code:

type ServerSecurityOptions struct {
    Secure      bool       `json:"secure"`
    Ssl         SslOptions `json:"ssl"`
    UseAuth     bool       `json:"useAuth"`
    AuditAccess bool       `json:"auditAccess"`
    ...
}

type SslOptions struct {
    CertificateFile string `json:"certificateFile"`
    KeyFile         string `json:"keyFile"`
}

This means that you can use self-signed certificates by providing the paths to your certificate and key files in the SslOptions configuration. You can find this in the flyteadmin/pkg/config/config.go file.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.


kdubovikov commented 9 months ago

I tried the REQUESTS_CA_BUNDLE suggestion, but it does not work. Also, ServerSecurityOptions appears to be a flyteadmin (server-side) configuration option, whereas the problem seems to be on the pyflyte client side.