kestra-io / plugin-kubernetes

https://kestra.io/plugins/plugin-kubernetes/
Apache License 2.0

`Failed to call 'uploadInputFiles'` if the pod is on a different node than the Kestra worker in EKS #148

Open surya9teja opened 1 month ago

surya9teja commented 1 month ago

Expected Behavior

I have an SQS trigger. When a new message arrives in the queue, it is converted into a .jsonl file and the file URI is passed as inputFiles to kubernetes.PodCreate. The file is then accessed and processed inside the pod.

Actual Behaviour

When I pass nodeSelector and tolerations to the Kubernetes pod so that it is scheduled on a different node than the one running the Kestra worker, the busybox file-upload container fails to upload the file I am passing via the flow.

When I remove the node selectors and tolerations, the inputFiles upload works as intended. From my observation, it only fails when the Kestra worker and the newly created task pod are not on the same node. For context, I use Karpenter to scale the EKS nodes up and down dynamically, in case that is related.
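
One way to check the "same node" case deterministically would be a pod-affinity constraint in the PodCreate spec (see the example flow below) that pins the task pod next to the Kestra worker. This is only a diagnostic sketch, not what I actually want (I need the private-cpu node); the worker label and namespace are assumptions about my Helm install:

      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: kestra   # assumed label on the Kestra worker pod
              namespaces:
                - kestra                           # assumed worker namespace; podAffinity only looks at the pod's own namespace unless this is set
              topologyKey: kubernetes.io/hostname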

Steps To Reproduce

Error log from the task creating the pod and failing:

2024-09-17T19:19:27.693Z INFO Pod 'microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk' is created 
2024-09-17T19:19:27.708Z DEBUG Received action 'ADDED' on [Type: Pod, Namespace: dev, Name: microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk, Uid: c19ebc8e-e92d-4791-a60e-f3dfc61f18e8, Phase: Pending]
2024-09-17T19:20:06.043Z DEBUG Received action 'MODIFIED' on [Type: Pod, Namespace: dev, Name: microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk, Uid: c19ebc8e-e92d-4791-a60e-f3dfc61f18e8, Phase: Pending]
2024-09-17T19:20:06.094Z DEBUG Received action 'MODIFIED' on [Type: Pod, Namespace: dev, Name: microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk, Uid: c19ebc8e-e92d-4791-a60e-f3dfc61f18e8, Phase: Pending]
2024-09-17T19:20:09.811Z DEBUG Received action 'MODIFIED' on [Type: Pod, Namespace: dev, Name: microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk, Uid: c19ebc8e-e92d-4791-a60e-f3dfc61f18e8, Phase: Pending]
2024-09-17T19:20:10.128Z DEBUG Failed to call 'uploadInputFiles'
2024-09-17T19:20:11.684Z DEBUG Failed to call 'uploadMarker'
2024-09-17T19:20:13.034Z DEBUG Failed to call 'uploadMarker'
2024-09-17T19:20:15.318Z DEBUG Failed to call 'uploadMarker'
2024-09-17T19:20:15.331Z DEBUG Received close on [Type: PodWatcher]
2024-09-17T19:20:15.345Z INFO Pod 'microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk' is deleted 
2024-09-17T19:20:15.350Z TRACE io.kestra.core.utils.RetryUtils$RetryFailed: Stop retry, attempts 3 elapsed after 3 seconds
    at io.kestra.core.utils.RetryUtils$Instance.lambda$exceptionFallback$4(RetryUtils.java:153)
    at dev.failsafe.internal.FallbackImpl.apply(FallbackImpl.java:58)
    at dev.failsafe.internal.FallbackExecutor.lambda$apply$0(FallbackExecutor.java:62)
    at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:187)
    at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
    at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:112)
    at io.kestra.core.utils.RetryUtils$Instance.wrap(RetryUtils.java:144)
    at io.kestra.core.utils.RetryUtils$Instance.run(RetryUtils.java:129)
    at io.kestra.plugin.kubernetes.services.PodService.withRetries(PodService.java:147)
    at io.kestra.plugin.kubernetes.services.PodService.uploadMarker(PodService.java:173)
    at io.kestra.plugin.kubernetes.AbstractPod.uploadInputFiles(AbstractPod.java:84)
    at io.kestra.plugin.kubernetes.PodCreate.run(PodCreate.java:193)
    at io.kestra.plugin.kubernetes.PodCreate.run(PodCreate.java:39)
    at io.kestra.core.runners.WorkerTaskThread.doRun(WorkerTaskThread.java:76)
    at io.kestra.core.runners.AbstractWorkerThread.run(AbstractWorkerThread.java:57)

2024-09-17T19:20:15.350Z ERROR Stop retry, attempts 3 elapsed after 3 seconds

Environment Information

Example flow

id: dev-boa-etl-part-1
namespace: microform.boa

triggers:
  - id: trigger
    type: io.kestra.plugin.aws.sqs.Trigger
    accessKeyId: "{{kv(key='aws_access_key', errorOnMissing=true)}}"
    secretKeyId: "{{kv(key='aws_secret_key', errorOnMissing=true)}}"
    region: "eu-west-2"
    serdeType: STRING
    queueUrl: "queueurl"
    maxRecords: 10
    maxDuration: PT10S

tasks:
  - id: to_json
    type: io.kestra.plugin.serdes.json.IonToJson
    from: "{{ trigger.uri }}"

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{read(outputs.to_json.uri)}}"

  - id: async_textract_job
    type: io.kestra.plugin.kubernetes.PodCreate
    namespace: dev
    inputFiles: 
      data.jsonl: "{{outputs.to_json.uri}}"
    metadata:
      labels:
        company: microform.boa
        task: boa-etl-pipeline-part-1
    waitRunning: PT1H
    waitUntilRunning: PT15M
    spec:
      containers:
        - name: boa-etl-pipeline-part-1
          image: 1tw678125321685.dkr.ecr.eu-west-2.amazonaws.com/boaml/etl:v1.0
          imagePullPolicy: IfNotPresent
          command:
            - python
            - textract.py
            - '--f'
            - "{{workingDir}}/data.jsonl"
          volumeMounts:
            - name: environment-vars
              mountPath: /app/.env
              subPath: .env
              readOnly: true
      nodeSelector:
        resource-type: private-cpu
      tolerations:
        - key: private/cpu
          operator: Exists
          effect: NoSchedule
      restartPolicy: OnFailure
      volumes:
        - name: environment-vars
          secret:
            secretName: boa-etl-env-vars
surya9teja commented 1 month ago

Update: I think I have found the likely culprit.

Before that, here are the potential scenarios I checked and considered:

  1. Kestra is unable to access the pod on another node: this can be ruled out because I created a ClusterRole with full access to the cluster and bound it to the Kestra service account.
  2. Network problem between nodes: my security groups allow all communication between nodes, and a ping from pod-A on node-1 to pod-B on node-2 was successful.
  3. Finally, the init container (busybox in this case, used to upload the files) starts before the node is ready when the node is freshly provisioned.

The last one is the actual cause of the failed file upload. When I add a dummy init container to the flow with just a 10s delay, the file uploads successfully.

My current setup is on EKS, where I use Karpenter to scale nodes up and down dynamically based on the node selectors and resources required by the pods. When the flow is triggered by an SQS message, it creates a pod that must be scheduled on a specific node. That node is not readily available; it is first provisioned by Karpenter, and only then is the pod scheduled on the newly provisioned node. During this window the init container, i.e. the file-upload container, tries to upload the file well before the node is ready, and with a retry interval of 0.5s and 3 attempts the retries are exhausted, so the whole file upload fails.

To work around this, I added a dummy init container that just sleeps for 10s before the file-upload container and the task pod proceed.

Sample flow:

  - id: test_pod_sqs
    type: io.kestra.plugin.kubernetes.PodCreate
    namespace: staging
    inputFiles: 
      data.jsonl: "{{outputs.to_json.uri}}"
    metadata:
      labels:
        company: microform.boa
        task: boa-etl-pipeline-part-1
    waitRunning: PT1H
    waitUntilRunning: PT30M
    spec:
      initContainers:
        - name: init-delay
          image: busybox
          command:
            - "/bin/sh"
            - "-c"
            - |
              echo 'Waiting'
              sleep 10
              echo 'Ready successfully'
      containers:
        - name: unittest
          image: debian:stable-slim
          command:
            - cat
            - "{{workingDir}}/data.jsonl"
      nodeSelector:
        resource-type: private-cpu
      tolerations:
        - key: private/cpu
          operator: Exists
          effect: NoSchedule
      restartPolicy: Never

I am not sure how the readiness check for the init container works in the backend, but it would help if you could let us modify the fileSidecar configuration, e.g. change the sleep duration or the number of retries, or if there were a better way to detect node readiness. For now this trick does the job.
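
For illustration, the sketch below shows the kind of configuration surface I have in mind. If I read the plugin docs correctly, fileSidecar.image can already be overridden; the retries and retryInterval properties are hypothetical and only illustrate the knobs that would help in this Karpenter scenario:

  - id: async_textract_job
    type: io.kestra.plugin.kubernetes.PodCreate
    namespace: dev
    inputFiles:
      data.jsonl: "{{outputs.to_json.uri}}"
    fileSidecar:
      image: busybox        # already configurable today, as far as I can tell
      retries: 10           # hypothetical: number of upload attempts
      retryInterval: PT5S   # hypothetical: delay between attempts
    spec:
      containers:
        - name: boa-etl-pipeline-part-1
          image: 1tw678125321685.dkr.ecr.eu-west-2.amazonaws.com/boaml/etl:v1.0
      nodeSelector:
        resource-type: private-cpu
      # ... rest of the spec (tolerations, restartPolicy, volumes) as in the flow above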

I hope this helps. Let me know if you want more information on this.

anna-geller commented 1 month ago

@loicmathieu only tagged so you can check 👍