flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.78k stars, 659 forks

[Docs] Causing errors when using --remote #5663

Open Prageeth-Wickramaarachchi opened 3 months ago

Prageeth-Wickramaarachchi commented 3 months ago

Description

I am new to this. When I try to run this locally using the following command, the following error occurs:

pyflyte run --remote -p my-project -d development hello_world.py hello_world_wf

HTTPConnectionPool(host='minio.flyte.svc.cluster.local', port=9000): Max retries exceeded with url: /my-s3-bucket/my-project/development/VZ4A4GREPN5DV2WDST5LW4PCHQ%3D%3D%3D%3D%3D%3D/script_mode.tar.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=minio%2F20240816%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240816T015639Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=content-md5%3Bhost%3Bx-amz-meta-flytecontentmd5&X-Amz-Signature=69c39070f6cdf8d10847564682eebdb2f149832fcb8098bee0b5e47b15bb2cc6 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb2fcf713c0>: Failed to establish a new connection: [Errno -2] Name or service not known'))
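The trailing "[Errno -2] Name or service not known" in this error is a name-resolution failure: the client machine could not turn the hostname in the upload URL into an IP address. A minimal sketch for checking this from the machine that runs pyflyte is shown below. The helper names `host_and_port` and `can_resolve` are made up for illustration and are not part of flytekit, and the URL is a shortened form of the one in the error.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(url: str) -> tuple:
    """Extract (hostname, port) from a URL, defaulting to 80 for http and 443 for https."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_resolve(host: str) -> bool:
    """Return True if this machine can resolve the hostname (DNS or hosts file)."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Resolution failures such as "Name or service not known" land here.
        return False


if __name__ == "__main__":
    # Shortened form of the upload URL from the error message (query string dropped).
    url = "http://minio.flyte.svc.cluster.local:9000/my-s3-bucket/my-project/development/script_mode.tar.gz"
    host, port = host_and_port(url)
    print(f"{host}:{port} resolvable from this machine: {can_resolve(host)}")
```

If this prints False on the client, the upload URL handed back by the Flyte backend points at a name that only exists inside the cluster, which would be consistent with the error above.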

welcome[bot] commented 3 months ago

Thank you for opening your first issue here! 🛠

runllm[bot] commented 3 months ago

The error you're encountering is likely due to a misconfiguration in the Flyte setup, specifically related to the MinIO service endpoint. According to this GitHub issue, the error "Failed to establish a new connection" often occurs when the MinIO service is not accessible from the client.

To resolve this, ensure that your ~/.flyte/config.yaml file is correctly configured. The storage.connection.endpoint should point to the correct MinIO service endpoint. For example:

admin:
  endpoint: dns:///flyte.myexample.com:30081
  authType: Pkce
  insecure: true
logger:
  show-source: true
  level: 0
storage:
  connection:
    access-key: minio
    auth-type: accesskey
    disable-ssl: true
    endpoint: http://minio.flyte.svc.cluster.local:9000
    region: us-east-1
    secret-key: miniostorage
  type: minio
  container: "my-s3-bucket"
  enable-multicontainer: false

Would you like more details on configuring Flyte or troubleshooting this issue?

Prageeth-Wickramaarachchi commented 3 months ago

I have changed the config-sandbox.yaml file, but I am still getting the same error.

pingsutw commented 3 months ago

@Prageeth-Wickramaarachchi Could you share your config file?

Prageeth-Wickramaarachchi commented 2 months ago

@pingsutw these are the contents of the two configuration files:

config.yaml

admin:
  endpoint: dns:///localhost:30080
  insecure: true

config-sandbox.yaml

admin:
  endpoint: localhost:30080
  insecure: true
storage:
  connection:
    endpoint: http://localhost:30002
    access-key: minio
    secret-key: miniostorage

Prageeth-Wickramaarachchi commented 2 months ago

@eapolinario can you help to resolve this?

pingsutw commented 2 months ago

@Prageeth-Wickramaarachchi, could you remove the storage config and try it again? I think it should work.

storage:
  connection:
    endpoint: http://localhost:30002/
    access-key: minio
    secret-key: miniostorage

Prageeth-Wickramaarachchi commented 2 months ago

@pingsutw I tried removing it, but when I run the command flytectl demo start, it changes the config file again.

eapolinario commented 2 months ago

@Prageeth-Wickramaarachchi , I couldn't repro this yet. Can you confirm that the k3s cluster is active and all pods in the flyte namespace are healthy, specifically the minio pod?

davidmirror-ops commented 1 month ago

@Prageeth-Wickramaarachchi how did you install Flyte? Is it just flytectl demo start, or do you have any other Flyte deployment? The error points to a service, minio.flyte.svc.cluster.local, that implies a very specific configuration: a K8s service called minio created in a flyte namespace, and that's not the case with the sandbox.
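To illustrate how that hostname maps to a Service and a namespace: Kubernetes gives Services in-cluster DNS names of the form `<service>.<namespace>.svc.<cluster-domain>`. The sketch below assumes the default cluster domain `cluster.local`, and `parse_cluster_service` is a hypothetical helper written for this thread, not a Flyte or Kubernetes API.

```python
from typing import Optional, Tuple

# Assumption: the cluster uses the default Kubernetes cluster domain.
CLUSTER_SUFFIX = ".svc.cluster.local"


def parse_cluster_service(host: str) -> Optional[Tuple[str, str]]:
    """Return (service, namespace) for <service>.<namespace>.svc.cluster.local, else None."""
    host = host.rstrip(".").lower()
    if not host.endswith(CLUSTER_SUFFIX):
        return None
    labels = host[: -len(CLUSTER_SUFFIX)].split(".")
    # Only the plain two-label Service form is handled in this sketch.
    if len(labels) != 2 or not all(labels):
        return None
    return labels[0], labels[1]


if __name__ == "__main__":
    print(parse_cluster_service("minio.flyte.svc.cluster.local"))
    print(parse_cluster_service("localhost"))
```

A name that parses this way is normally only resolvable by the cluster's own DNS, so a client outside the cluster would not be expected to reach it without extra setup.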

akkolyth commented 1 month ago

I have the same issue.

(1) pyflyte run --remote hello_world.py my_wf

(2)

❯ kubectl get pods -n flyte
NAME                            READY   STATUS    RESTARTS      AGE
flyte-binary-78c64d4c46-ncp5s   1/1     Running   3 (14m ago)   7h14m
minio-65b696c89-4sv2c           1/1     Running   3 (14m ago)   10h
postgres-7456d5c788-h65wn       1/1     Running   4 (14m ago)   7h28m

(3) k8s is healthy (checked logs in Lens and everything is OK with MinIO)

❯ kubectl get services -n flyte
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                         AGE
flyte-binary-grpc      ClusterIP   10.43.36.111    <none>        8089/TCP                        10h
flyte-binary-http      ClusterIP   10.43.146.229   <none>        8088/TCP                        10h
flyte-binary-webhook   ClusterIP   10.43.17.138    <none>        443/TCP                         10h
minio                  NodePort    10.43.109.88    <none>        9000:30084/TCP,9001:30088/TCP   10h
postgres               NodePort    10.43.24.203    <none>        5432:30089/TCP                  10h

(4) local IP translation is set

❯ ping minio.flyte.svc.cluster.local
PING minio.flyte.svc.cluster.local (127.0.1.1) 56(84) bytes of data.
64 bytes from akkolyth. (127.0.1.1): icmp_seq=1 ttl=64 time=0.027 ms
64 bytes from akkolyth. (127.0.1.1): icmp_seq=2 ttl=64 time=0.042 ms
64 bytes from akkolyth. (127.0.1.1): icmp_seq=3 ttl=64 time=0.033 ms
64 bytes from akkolyth. (127.0.1.1): icmp_seq=4 ttl=64 time=0.025 ms

(5)

work/k8s/flyte via 🐍 v3.10.12
❯ echo $FLYTECTL_CONFIG
/home/akkolyth/.flyte/config.yaml
work/k8s/flyte via 🐍 v3.10.12
❯ cat /home/akkolyth/.flyte/config.yaml
admin:
  endpoint: dns:///localhost:8089
  authType: Pkce
  insecure: true
logger:
  show-source: true
  level: 0
storage:
  type: minio
  connection:
    endpoint: http://minio.flyte.svc.cluster.local:9000
    access-key: minio
    auth-type: accesskey
    secret-key: miniostorage
    disable-ssl: true

At first glance, it looks like the client is not passing credentials to MinIO as expected.
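One thing the ping in step (4) does not show is whether anything accepts TCP connections on the port in the configured endpoint. In the kubectl output above, the minio Service is listed as NodePort 9000:30084, where 9000 is the in-cluster Service port and 30084 is the port exposed on the node. A minimal sketch for probing both from the client follows. `can_connect` is a hypothetical helper, and whether the node is reachable under the mapped address depends on the local setup.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    host = "minio.flyte.svc.cluster.local"
    # 9000 is the Service port from config.yaml; 30084 is the NodePort from kubectl.
    for port in (9000, 30084):
        print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

If 9000 is unreachable while 30084 is reachable, the problem would be the endpoint port seen from outside the cluster rather than the credentials.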

akkolyth commented 1 month ago

The repro should be straightforward: e.g., set up a fresh k3s cluster and follow this guide.

davidmirror-ops commented 1 month ago

@akkolyth thanks for providing the info. Could you try removing the storage section from your config.yaml? I'm not sure how it is interacting with the Helm values used in the guide.