argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0
14.49k stars 3.11k forks source link

executor plugins example from documentation does not work #13015

Open dgolja opened 1 month ago

dgolja commented 1 month ago

Pre-requisites

What happened/what did you expect to happen?

Executor_plugins example provided in the documentation does not work even after adjusting the default service account permissions.

To setup the initial environment I followed the quick guide steps and updated default service account RBAC permissions.

kubectl -n argo apply -f - <<EOF
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: agent
rules:
- apiGroups:
  - argoproj.io
  resources:
  - workflowtasksets
  verbs:
  - list
  - watch
- apiGroups:
  - argoproj.io
  resources:
  - workflowtasksets/status
  verbs:
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-binding-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: agent
subjects:
- kind: ServiceAccount
  name: default
  namespace: argo
---
apiVersion: v1
kind: Secret
metadata:
  name: default.service-account-token
  annotations:
    kubernetes.io/service-account.name: default
type: kubernetes.io/service-account-token
EOF

Also it's odd that the logs are complaining about system:serviceaccount:argo:argo permissions, even If I am not setting the service account to argo in the Workflow. Based on the documentation it should use default.

Version

latest

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflows that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-executor-plugin-example-
spec:
  entrypoint: main
  templates:
    - name: main
      plugin:
        hello: { }

Logs from the workflow controller

time="2024-05-07T12:38:18.038Z" level=info msg="Processing workflow" Phase= ResourceVersion=3690 namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.040Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.040Z" level=info msg="Updated phase  -> Running" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.040Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.041Z" level=warning msg="[DEBUG] boundaryID was nil" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.041Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.041Z" level=info msg="Plugin node hello-executor-plugin-example-7b72f initialized Pending" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.041Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.041Z" level=info msg="Creating TaskSet" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.045Z" level=info msg=reconcileAgentPod namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.056Z" level=info msg="Created Agent pod" namespace=argo podName=hello-executor-plugin-example-7b72f-1340600742-agent workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.056Z" level=info msg=updateAgentPodStatus namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.056Z" level=info msg=assessAgentPodStatus namespace=argo podName=hello-executor-plugin-example-7b72f-1340600742-agent
time="2024-05-07T12:38:18.057Z" level=warning msg="error updating taskset" error="failed patching taskset: workflowtasksets.argoproj.io \"hello-executor-plugin-example-7b72f\" is forbidden: User \"system:serviceaccount:argo:argo\" cannot patch resource \"workflowtasksets/status\" in API group \"argoproj.io\" in the namespace \"argo\"" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:18.064Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=3695 workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:23.182Z" level=info msg="insignificant pod change" key=argo/hello-executor-plugin-example-7b72f-1340600742-agent
time="2024-05-07T12:38:28.056Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3695 namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:28.056Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:28.056Z" level=info msg=updateAgentPodStatus namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:28.056Z" level=info msg=assessAgentPodStatus namespace=argo podName=hello-executor-plugin-example-7b72f-1340600742-agent
time="2024-05-07T12:38:28.057Z" level=error msg="was unable to obtain node for hello-executor-plugin-example-7b72f-2166136261" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:28.057Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:28.057Z" level=info msg="Creating TaskSet" namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:28.075Z" level=info msg=reconcileAgentPod namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:28.075Z" level=info msg=updateAgentPodStatus namespace=argo workflow=hello-executor-plugin-example-7b72f
time="2024-05-07T12:38:28.075Z" level=info msg=assessAgentPodStatus namespace=argo podName=hello-executor-plugin-example-7b72f-1340600742-agent

Logs from in your workflow's wait container

error: container wait is not valid for pod hello-executor-plugin-example-7b72f-1340600742-agent
dgolja commented 1 month ago

Changing the runAsUser from 65534 to 1000 in the plugin.yaml executor_plugins documentation fixed the issue.

I will create an PR with the updated documentation and add some more noes about the RBAC expectations for the SA running those tasks.

Hopefully this will save some time to the next one trying the examples from the documentation.

agilgur5 commented 1 month ago

Also it's odd that the logs are complaining about system:serviceaccount:argo:argo permissions, even If I am not setting the service account to argo in the Workflow. Based on the documentation it should use default.

In the code it seems to use whatever SA the Workflow has set. You haven't set it in your Workflow (or workflowDefaults I assume), so it should indeed use default 🤔

dgolja commented 1 month ago

In the code it seems to use whatever SA the Workflow has set. You haven't set it in your Workflow (or workflowDefaults I assume), so it should indeed use default 🤔

Yes, I thought the same, so I'm not sure why I encountered that error. I will investigate it further when I have more time.