Hi @susana-garcia, I would need more details to help you. We need to determine why the pods are pending: kubectl describe pod ...
will show you events related to scheduling, with some details. The pod status may also give a reason in the conditions section. Next, logs from the proxy-scheduler and/or the delegate-scheduler may help.
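For example, a sketch with placeholder names (the admiralty namespace and component=proxy-scheduler label match the default Helm chart values shown in the describe outputs below):
$ kubectl describe pod <workflow-pod-name> --context <cluster-context>
$ kubectl logs -n admiralty -l component=proxy-scheduler --context <cluster-context>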
@adrienjt thank you for your quick reply.
That was what I expected, but:
$ echo $ARGO_CLUSTER
kind-cd
$ argo --context $ARGO_CLUSTER submit --serviceaccount argo-workflow https://raw.githubusercontent.com/admiraltyio/admiralty/master/examples/argo-workflows/blog-scenario-a-multicluster.yaml
Name: multicluster-parallel-wjdpm
Namespace: default
ServiceAccount: argo-workflow
Status: Pending
Created: Wed Dec 09 19:52:48 +0100 (now)
$ kubectl describe pod multicluster-parallel-wjdpm --context kind-cd
Error from server (NotFound): pods "multicluster-parallel-wjdpm" not found
The controller-manager:
$ kubectl describe pod admiralty-multicluster-scheduler-controller-manager-68b487c5mtx --context kind-cd -n admiralty
Name: admiralty-multicluster-scheduler-controller-manager-68b487c5mtx
Namespace: admiralty
Priority: 0
Node: cd-control-plane/172.19.0.2
Start Time: Wed, 09 Dec 2020 15:34:35 +0100
Labels: app.kubernetes.io/instance=admiralty
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=multicluster-scheduler
app.kubernetes.io/version=0.13.1
component=controller-manager
helm.sh/chart=multicluster-scheduler-0.13.1
pod-template-hash=68b487f689
Annotations: <none>
Status: Running
IP: 10.244.0.12
IPs:
IP: 10.244.0.12
Controlled By: ReplicaSet/admiralty-multicluster-scheduler-controller-manager-68b487f689
Containers:
controller-manager:
Container ID: containerd://268dd5ec5d59d666fec67f7723f60bec24078bd410cce96869b5c08247045c1a
Image: quay.io/admiralty/multicluster-scheduler-agent:0.13.1
Image ID: sha256:1efc05f72f2cbb96fffcd92e8a575ebe3bc2f0d2ab73ace948e5aa5c8a37f6d5
Ports: 9443/TCP, 10250/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Wed, 09 Dec 2020 15:34:38 +0100
Ready: True
Restart Count: 0
Environment:
SOURCE_CLUSTER_ROLE_NAME: admiralty-multicluster-scheduler-source
CLUSTER_SUMMARY_VIEWER_CLUSTER_ROLE_NAME: admiralty-multicluster-scheduler-cluster-summary-viewer
VKUBELET_POD_IP: (v1:status.podIP)
Mounts:
/tmp/k8s-webhook-server/serving-certs from cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from admiralty-multicluster-scheduler-token-txtfh (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
cert:
Type: Secret (a volume populated by a Secret)
SecretName: admiralty-multicluster-scheduler-cert
Optional: false
admiralty-multicluster-scheduler-token-txtfh:
Type: Secret (a volume populated by a Secret)
SecretName: admiralty-multicluster-scheduler-token-txtfh
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
The proxy-scheduler:
$ kubectl describe pod admiralty-multicluster-scheduler-proxy-scheduler-96bf4685cx224l --context kind-cd -n admiralty
Name: admiralty-multicluster-scheduler-proxy-scheduler-96bf4685cx224l
Namespace: admiralty
Priority: 0
Node: cd-control-plane/172.19.0.2
Start Time: Wed, 09 Dec 2020 15:34:36 +0100
Labels: app.kubernetes.io/instance=admiralty
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=multicluster-scheduler
app.kubernetes.io/version=0.13.1
component=proxy-scheduler
helm.sh/chart=multicluster-scheduler-0.13.1
pod-template-hash=96bf4685c
Annotations: checksum/config: 13124b8456b347b0e65de123c71ae62893206564731b738d1abf2aaf822459ee
Status: Running
IP: 10.244.0.13
IPs:
IP: 10.244.0.13
Controlled By: ReplicaSet/admiralty-multicluster-scheduler-proxy-scheduler-96bf4685c
Containers:
proxy-scheduler:
Container ID: containerd://75a0a07bf998a32d4c82ce28b9d7defdec6a58f5b5d90ac3432f6de651f37357
Image: quay.io/admiralty/multicluster-scheduler-scheduler:0.13.1
Image ID: sha256:77602cddf06b865655886c5f6577dd039c7f260590a6d3213266700757d9817f
Port: <none>
Host Port: <none>
Args:
--config
/etc/admiralty/proxy-scheduler-config
State: Running
Started: Wed, 09 Dec 2020 15:34:39 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/etc/admiralty from config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from admiralty-multicluster-scheduler-token-txtfh (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: admiralty-multicluster-scheduler
Optional: false
admiralty-multicluster-scheduler-token-txtfh:
Type: Secret (a volume populated by a Secret)
SecretName: admiralty-multicluster-scheduler-token-txtfh
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
The candidate-scheduler:
$ kubectl describe pod admiralty-multicluster-scheduler-candidate-scheduler-6fc4d95r4l --context kind-cd -n admiralty
Name: admiralty-multicluster-scheduler-candidate-scheduler-6fc4d95r4l
Namespace: admiralty
Priority: 0
Node: cd-control-plane/172.19.0.2
Start Time: Wed, 09 Dec 2020 15:21:31 +0100
Labels: app.kubernetes.io/instance=admiralty
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=multicluster-scheduler
app.kubernetes.io/version=0.13.1
component=candidate-scheduler
helm.sh/chart=multicluster-scheduler-0.13.1
pod-template-hash=6fc4d4dbf8
Annotations: checksum/config: 13124b8456b347b0e65de123c71ae62893206564731b738d1abf2aaf822459ee
Status: Running
IP: 10.244.0.8
IPs:
IP: 10.244.0.8
Controlled By: ReplicaSet/admiralty-multicluster-scheduler-candidate-scheduler-6fc4d4dbf8
Containers:
candidate-scheduler:
Container ID: containerd://ce6ee772a61051df0dddfab7ede83b39583a7026d880604be4d5bc6dd15855ce
Image: quay.io/admiralty/multicluster-scheduler-scheduler:0.13.1
Image ID: sha256:77602cddf06b865655886c5f6577dd039c7f260590a6d3213266700757d9817f
Port: <none>
Host Port: <none>
Args:
--config
/etc/admiralty/candidate-scheduler-config
State: Running
Started: Wed, 09 Dec 2020 15:21:37 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/etc/admiralty from config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from admiralty-multicluster-scheduler-token-txtfh (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: admiralty-multicluster-scheduler
Optional: false
admiralty-multicluster-scheduler-token-txtfh:
Type: Secret (a volume populated by a Secret)
SecretName: admiralty-multicluster-scheduler-token-txtfh
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
And the scheduler-restarter:
$ kubectl describe pod admiralty-multicluster-scheduler-restarter-7c5644bf89-qz545 --context kind-cd -n admiralty
Name: admiralty-multicluster-scheduler-restarter-7c5644bf89-qz545
Namespace: admiralty
Priority: 0
Node: cd-control-plane/172.19.0.2
Start Time: Wed, 09 Dec 2020 15:21:31 +0100
Labels: app.kubernetes.io/instance=admiralty
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=multicluster-scheduler
app.kubernetes.io/version=0.13.1
component=restarter
helm.sh/chart=multicluster-scheduler-0.13.1
pod-template-hash=7c5644bf89
Annotations: <none>
Status: Running
IP: 10.244.0.9
IPs:
IP: 10.244.0.9
Controlled By: ReplicaSet/admiralty-multicluster-scheduler-restarter-7c5644bf89
Containers:
restarter:
Container ID: containerd://47ec7261904f1faf9aa02fa5d97bffe0ec5d24bffc6e6bf5dd2d32cc15b1502e
Image: quay.io/admiralty/multicluster-scheduler-restarter:0.13.1
Image ID: sha256:4189b532debb3c58e942653dfbc862038449e6afc76a4b6c2a63ad71eff87814
Port: <none>
Host Port: <none>
State: Running
Started: Wed, 09 Dec 2020 15:21:37 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from admiralty-multicluster-scheduler-token-txtfh (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
admiralty-multicluster-scheduler-token-txtfh:
Type: Secret (a volume populated by a Secret)
SecretName: admiralty-multicluster-scheduler-token-txtfh
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
1) multicluster-parallel-wjdpm is the name of the Argo workflow object. Pods created by that workflow have different names, hence the "not found" error.
2) I don't need the events of the Admiralty control plane pods, but those of the workflow pods. I could use the logs of the Admiralty control plane pods, though.
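For instance, assuming Argo labels its pods with workflows.argoproj.io/workflow (recent versions do), the workflow's pods can be listed with either of:
$ kubectl get pods --context kind-cd -l workflows.argoproj.io/workflow=multicluster-parallel-wjdpm
$ argo --context kind-cd get multicluster-parallel-wjdpm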
OK, then I guess the pods for the Argo workflow object were not created:
$ kubectl get pods --context kind-cd --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
admiralty admiralty-multicluster-scheduler-candidate-scheduler-6fc4d95r4l 1/1 Running 0 5h19m
admiralty admiralty-multicluster-scheduler-controller-manager-68b487c5mtx 1/1 Running 0 5h6m
admiralty admiralty-multicluster-scheduler-proxy-scheduler-96bf4685cx224l 1/1 Running 0 5h6m
admiralty admiralty-multicluster-scheduler-restarter-7c5644bf89-qz545 1/1 Running 0 5h19m
cert-manager cert-manager-cainjector-fc6c787db-mn9wh 1/1 Running 2 5h38m
cert-manager cert-manager-d994d94d7-r5c95 1/1 Running 1 5h38m
cert-manager cert-manager-webhook-845d9df8bf-tsglt 1/1 Running 1 5h38m
kube-system coredns-f9fd979d6-pl99c 1/1 Running 0 5h50m
kube-system coredns-f9fd979d6-vgd2b 1/1 Running 0 5h50m
kube-system etcd-cd-control-plane 1/1 Running 0 5h50m
kube-system kindnet-2ccnn 0/1 Pending 0 5h6m
kube-system kindnet-4wkgt 1/1 Running 0 5h50m
kube-system kube-apiserver-cd-control-plane 1/1 Running 1 5h50m
kube-system kube-controller-manager-cd-control-plane 0/1 Running 2 5h50m
kube-system kube-proxy-5rvqh 0/1 Pending 0 5h6m
kube-system kube-proxy-hvntj 1/1 Running 0 5h50m
kube-system kube-scheduler-cd-control-plane 0/1 Running 2 5h50m
local-path-storage local-path-provisioner-78776bfc44-nsdh9 1/1 Running 2 5h50m
Sorry, I'm not sure which pod logs you need from here, then.
Okay, then either there's a problem with your Argo install, or Admiralty's pod admission webhook is failing. What could be helpful:
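A sketch of checks that could narrow this down, assuming the namespaces used above (the controller-manager serves the admission webhook, per the 9443 port and webhook serving-certs mount in its describe output):
$ kubectl get pods -n argo --context kind-cd
$ kubectl logs -n admiralty -l component=controller-manager --context kind-cd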
Hi again,
So the problem was in the step of installing Argo from the manifest:
$ kubectl --context $ARGO_CLUSTER create ns argo
$ kubectl --context $ARGO_CLUSTER apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/v2.2.1/manifests/install.yaml
namespace/argo created
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/workflows.argoproj.io created
serviceaccount/argo created
serviceaccount/argo-ui created
clusterrole.rbac.authorization.k8s.io/argo-aggregate-to-admin created
clusterrole.rbac.authorization.k8s.io/argo-aggregate-to-edit created
clusterrole.rbac.authorization.k8s.io/argo-aggregate-to-view created
clusterrole.rbac.authorization.k8s.io/argo-cluster-role created
clusterrole.rbac.authorization.k8s.io/argo-ui-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/argo-binding created
clusterrolebinding.rbac.authorization.k8s.io/argo-ui-binding created
configmap/workflow-controller-configmap created
service/argo-ui created
unable to recognize "https://raw.githubusercontent.com/argoproj/argo/v2.2.1/manifests/install.yaml": no matches for kind "Deployment" in version "apps/v1beta2"
unable to recognize "https://raw.githubusercontent.com/argoproj/argo/v2.2.1/manifests/install.yaml": no matches for kind "Deployment" in version "apps/v1beta2"
No deployment was created: the v2.2.1 manifest uses apps/v1beta2, which was removed in Kubernetes 1.16, so its Deployments could not be applied.
So I first deleted what had been applied:
$ kubectl --context $ARGO_CLUSTER delete -n argo -f https://raw.githubusercontent.com/argoproj/argo/v2.2.1/manifests/install.yaml
And instead I installed Argo with Helm:
$ helm upgrade --install argo argo/argo --kube-context $ARGO_CLUSTER --version 0.13.10 --namespace argo
And after that, it worked.
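For reference, a quick sanity check (assuming the chart deploys its pods into the argo namespace passed to helm above):
$ kubectl get pods -n argo --context $ARGO_CLUSTER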
@adrienjt thank you for the support!
@adrienjt by the way, this link in the Argo docs is not working anymore: https://raw.githubusercontent.com/admiraltyio/multicluster-scheduler/master/config/samples/argo-workflows/_service-account.yaml
I used this instead: https://raw.githubusercontent.com/admiraltyio/admiralty/master/examples/argo-workflows/_service-account.yaml
If you want, I can open a PR with this link change and with the update to install Argo using Helm.
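For context, the working manifest can be applied like the other examples (assuming it defines the argo-workflow service account passed to argo submit above):
$ kubectl --context $ARGO_CLUSTER apply -f https://raw.githubusercontent.com/admiraltyio/admiralty/master/examples/argo-workflows/_service-account.yaml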
That would be much appreciated @susana-garcia.
@adrienjt I've just realized that the steps with the typos are not under the /docs folder of this project, but in the blog, which I can't update. Sorry about that :(
Hi, I followed the guide and tried to run multi-cluster scheduling using Argo's example, but the pods always stay pending and nothing really happens:
More info: to make it simpler, I only have two clusters: kind-cd and kind-eu.
Also, maybe some outputs:
Please let me know if I can provide more info.