Closed by jiria 2 years ago
I was able to reproduce this by installing the same version of K3s (`v1.22.6+k3s1`), which is its latest stable release. While Jobs (which are created by Akri) are named correctly, the Pods spun up by the Kubernetes [Job controller](https://kubernetes.io/docs/concepts/architecture/controller/) are not named as expected. It looks like the latest K3s is adding an extra `--1` to the Pod names:
NAME READY STATUS RESTARTS AGE
pod/akri-debug-echo-discovery-daemonset-6ggfk 1/1 Running 0 6m22s
pod/akri-agent-daemonset-qx58w 1/1 Running 0 6m22s
pod/akri-controller-deployment-5465b887d7-swwbf 1/1 Running 0 6m22s
pod/akri-debug-echo-a19705-1-job--1-ghx6b 0/1 Completed 0 6m1s
pod/akri-debug-echo-8120fe-1-job--1-j72x5 0/1 Completed 0 6m2s
NAME COMPLETIONS DURATION AGE
job.batch/akri-debug-echo-8120fe-1-job 1/1 9s 6m2s
job.batch/akri-debug-echo-a19705-1-job 1/1 8s 6m1s
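To make the difference concrete, here is a minimal sketch that extracts whatever sits between the Job name and the random Pod suffix. The helper name `extra_segment` is hypothetical and not part of Akri or Kubernetes; it only assumes that a Pod name starts with its Job's name followed by a dash and ends with a dash-free random suffix, as in the output above.

```python
def extra_segment(pod_name: str, job_name: str) -> str:
    """Return the text between '<job_name>-' and the final random suffix.

    Hypothetical helper for illustration only. An empty string means the
    Pod is named '<job_name>-<suffix>', which is the expected form.
    """
    prefix = job_name + "-"
    if not pod_name.startswith(prefix):
        raise ValueError(f"{pod_name!r} does not belong to job {job_name!r}")
    rest = pod_name[len(prefix):]  # "-1-ghx6b" for the odd names, "brz64" otherwise
    # Split off the random suffix; anything left in front of it is unexpected.
    middle, _, _suffix = rest.rpartition("-")
    return middle


job = "akri-debug-echo-a19705-1-job"
print(repr(extra_segment("akri-debug-echo-a19705-1-job--1-ghx6b", job)))
print(repr(extra_segment("akri-debug-echo-a19705-1-job-brz64", job)))
```

The first call uses a Pod name from the output above; the second uses the conventional `<job>-<suffix>` form for comparison.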
I then installed the same version of MicroK8s (`v1.22.6-3+7ab10db7034594`) and was able to reproduce the odd naming:
NAME READY STATUS RESTARTS AGE
pod/akri-debug-echo-discovery-daemonset-6f5m4 1/1 Running 0 40s
pod/akri-agent-daemonset-bxzgw 1/1 Running 0 40s
pod/akri-controller-deployment-5465b887d7-g9tvl 1/1 Running 0 40s
pod/akri-debug-echo-a19705-1-job--1-ts6n8 0/1 Completed 0 19s
pod/akri-debug-echo-8120fe-1-job--1-fld7p 0/1 Completed 0 20s
NAME COMPLETIONS DURATION AGE
job.batch/akri-debug-echo-a19705-1-job 1/1 11s 19s
job.batch/akri-debug-echo-8120fe-1-job 1/1 13s 20s
I was not able to reproduce this with the latest version of K3s that we run our end-to-end tests on, namely `v1.21.5+k3s2`. Pods are named as expected (e.g. `akri-debug-echo-8120fe-1-job-khzg7`). Output on a single-node `v1.21.5+k3s2` cluster:
NAME READY STATUS RESTARTS AGE
pod/akri-debug-echo-discovery-daemonset-p9x2f 1/1 Running 0 52s
pod/akri-controller-deployment-5465b887d7-xmgx2 1/1 Running 0 52s
pod/akri-agent-daemonset-frg65 1/1 Running 0 52s
pod/akri-debug-echo-a19705-1-job-brz64 0/1 Completed 0 37s
pod/akri-debug-echo-8120fe-1-job-nn5mr 0/1 Completed 0 37s
NAME COMPLETIONS DURATION AGE
job.batch/akri-debug-echo-a19705-1-job 1/1 13s 37s
job.batch/akri-debug-echo-8120fe-1-job 1/1 14s 37s
I then tried out MicroK8s 1.23 (`v1.23.3-2+d441060727c463`) and got the expected behavior:
NAME READY STATUS RESTARTS AGE
pod/akri-agent-daemonset-49m5z 1/1 Running 0 30s
pod/akri-controller-deployment-7675bdddf-7jnwj 1/1 Running 0 30s
pod/akri-debug-echo-discovery-daemonset-t46tk 1/1 Running 0 30s
pod/akri-debug-echo-8120fe-1-job-2w2bd 0/1 Completed 0 13s
pod/akri-debug-echo-a19705-1-job-djq7j 0/1 Completed 0 14s
NAME COMPLETIONS DURATION AGE
job.batch/akri-debug-echo-a19705-1-job 1/1 8s 14s
job.batch/akri-debug-echo-8120fe-1-job 1/1 7s 13s
I also tried out the latest K3s version, `v1.23.3+k3s1`, and got the expected behavior:
NAME READY STATUS RESTARTS AGE
pod/akri-debug-echo-discovery-daemonset-pcnks 1/1 Running 0 53s
pod/akri-agent-daemonset-58sts 1/1 Running 0 53s
pod/akri-controller-deployment-7675bdddf-mw8vg 1/1 Running 0 53s
pod/akri-debug-echo-8120fe-1-job-dg5p2 0/1 Completed 0 28s
pod/akri-debug-echo-a19705-1-job-7n2kh 0/1 Completed 0 29s
NAME COMPLETIONS DURATION AGE
job.batch/akri-debug-echo-8120fe-1-job 1/1 7s 28s
job.batch/akri-debug-echo-a19705-1-job 1/1 8s 29s
It looks like this is a bug in Kubernetes version `1.22.6`. It might be interesting to see whether this was a known issue. Regardless, later versions have fixed it.
I think it is safe to close this as a bug against Akri because (1) Akri does not name the Pods, only the Jobs, which are named correctly; (2) Pods are named as expected in the versions before and after `1.22`; and (3) this naming quirk does not affect Akri's behavior.
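As a quick cross-check of that conclusion, the sketch below tabulates one Pod name per tested distribution, copied from the outputs in this comment, and flags the ones that contain two consecutive dashes. The `observations` mapping and the `"--" in name` test are only an illustration of the reasoning, not anything Akri or Kubernetes provides.

```python
# One sample Pod name per tested distribution, copied from the outputs above.
observations = {
    "K3s v1.22.6+k3s1": "akri-debug-echo-a19705-1-job--1-ghx6b",
    "MicroK8s v1.22.6-3+7ab10db7034594": "akri-debug-echo-a19705-1-job--1-ts6n8",
    "K3s v1.21.5+k3s2": "akri-debug-echo-a19705-1-job-brz64",
    "MicroK8s v1.23.3-2+d441060727c463": "akri-debug-echo-a19705-1-job-djq7j",
    "K3s v1.23.3+k3s1": "akri-debug-echo-a19705-1-job-7n2kh",
}

# A Pod name with two consecutive dashes is the symptom reported in this issue.
affected = sorted(dist for dist, pod in observations.items() if "--" in pod)

for dist in affected:
    print("affected:", dist)
```

Only the two distributions based on Kubernetes 1.22.6 are flagged, which matches the observations above.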
Closing. See Kate's earlier comment.
Describe the bug: Pods generated on behalf of the job broker type contain two consecutive dashes, as if a token were missing.
Output of `kubectl get pods,akrii,akric -o wide`
Kubernetes Version: [e.g. Native Kubernetes 1.19, MicroK8s 1.19, Minikube 1.19, K3s]
To Reproduce: Steps to reproduce the behavior:
Expected behavior: I would expect the generated Pod name to be either `akri-debug-echo-a19705-1-job-1-48s8q` or `akri-debug-echo-a19705-1-job-something-1-48s8q`, but not `akri-debug-echo-a19705-1-job--1-48s8q`.
Logs (please share snips of applicable logs): Snippet from the controller log: