todaygood / openshift-lab

lab on openshift

kb_use_job #8

Open todaygood opened 6 years ago

todaygood commented 6 years ago

pi job

perl -Mbignum=bpi -wle "print bpi(2000)"

Test

If activeDeadlineSeconds is shorter than the job's running time, the pod will be killed.


apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout
spec:
#  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    metadata:
      name: pi
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never

After running it, the pod was deleted (killed) soon after it started.

[root@ose0 ~]# oc get job -o wide 
NAME              DESIRED   SUCCESSFUL   AGE       CONTAINERS   IMAGES    SELECTOR
pi-with-timeout   1         0            3m        pi           perl      controller-uid=e5e16de7-9459-11e8-be1e-5254005e84cc
[root@ose0 ~]# oc get pod -a 
NAME              READY     STATUS      RESTARTS   AGE
ruby-ex-1-build   0/1       Completed   0          22h
ruby-ex-2-build   0/1       Completed   0          15h
ruby-ex-3-88vr4   1/1       Running     0          15h
ruby-ex-3-mxm75   1/1       Running     0          15h
ruby-ex-3-z4495   1/1       Running     0          15h

[root@ose0 ~]# oc describe job 
Name:                     pi-with-timeout
Namespace:                hello
Selector:                 controller-uid=e5e16de7-9459-11e8-be1e-5254005e84cc
Labels:                   controller-uid=e5e16de7-9459-11e8-be1e-5254005e84cc
                          job-name=pi-with-timeout
Annotations:              <none>
Parallelism:              1
Completions:              1
Start Time:               Tue, 31 Jul 2018 08:37:26 +0800
Active Deadline Seconds:  100s
Pods Statuses:            0 Running / 0 Succeeded / 1 Failed
Pod Template:
  Labels:  controller-uid=e5e16de7-9459-11e8-be1e-5254005e84cc
           job-name=pi-with-timeout
  Containers:
   pi:
    Image:  perl
    Port:   <none>
    Command:
      perl
      -Mbignum=bpi
      -wle
      print bpi(2000)
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type     Reason            Age              From            Message
  ----     ------            ----             ----            -------
  Normal   SuccessfulCreate  5m               job-controller  Created pod: pi-with-timeout-c6f4p
  Normal   SuccessfulDelete  3m               job-controller  Deleted pod: pi-with-timeout-c6f4p
  Warning  DeadlineExceeded  3m (x2 over 3m)  job-controller  Job was active longer than specified deadline

After changing the deadline to 600s, I finally saw a pod run to Completed, as described in the documentation:

https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/

Job Termination and Cleanup

When a Job completes, no more Pods are created, but the Pods are not deleted either. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status. Delete the job with kubectl (e.g. kubectl delete jobs/pi or kubectl delete -f ./job.yaml). When you delete the job using kubectl, all the pods it created are deleted too.
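As a side note, newer Kubernetes versions can also clean up finished Jobs automatically. A minimal sketch, assuming the TTL-after-finished feature is available in the cluster (the job name here is hypothetical):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl                # hypothetical name, for illustration only
spec:
  ttlSecondsAfterFinished: 100     # delete this Job (and its pods) 100s after it finishes
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

Note that once the TTL controller deletes the Job, the completed pod's logs are gone with it, so this trades the manual cleanup described above for a limited window in which to inspect output.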

By default, a Job will run uninterrupted unless a Pod fails, at which point the Job defers to the .spec.backoffLimit described above. Another way to terminate a Job is by setting an active deadline. Do this by setting the .spec.activeDeadlineSeconds field of the Job to a number of seconds.

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout
spec:
#  backoffLimit: 5
  activeDeadlineSeconds: 600
  template:
    metadata:
      name: pi
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never  

One pod is left in Completed state, which can be used to view the logs.


[root@ose0 ~]# oc get job 
NAME              DESIRED   SUCCESSFUL   AGE
pi-with-timeout   1         1            10m
[root@ose0 ~]# oc get pod 
NAME                    READY     STATUS      RESTARTS   AGE
pi-with-timeout-kl7c2   0/1       Completed   0          11m
ruby-ex-1-build         0/1       Completed   0          22h
ruby-ex-2-build         0/1       Completed   0          16h
ruby-ex-3-88vr4         1/1       Running     0          15h
ruby-ex-3-mxm75         1/1       Running     0          15h
ruby-ex-3-z4495         1/1       Running     0          15h
[root@ose0 ~]# oc logs pi-with-timeout-kl7c2
3.1415 ...    
todaygood commented 6 years ago

Test: reboot the ose2 node

[root@ose0 ~]# oc get pods -o wide 
NAME                    READY     STATUS      RESTARTS   AGE       IP            NODE
pi-with-timeout-kl7c2   0/1       Completed   0          36m       10.130.0.6    ose1.cloud.genomics.cn
ruby-ex-1-build         0/1       Completed   0          22h       10.128.2.3    ose3.cloud.genomics.cn
ruby-ex-2-build         0/1       Completed   0          16h       10.131.0.8    ose2.cloud.genomics.cn
ruby-ex-3-88vr4         1/1       Running     0          16h       10.129.0.5    ose4.cloud.genomics.cn
ruby-ex-3-mxm75         0/1       Completed   0          16h       <none>        ose2.cloud.genomics.cn
ruby-ex-3-z4495         1/1       Running     0          16h       10.130.2.24   ose7.cloud.genomics.cn
[root@ose0 ~]# oc get pods -o wide 
NAME                    READY     STATUS      RESTARTS   AGE       IP            NODE
pi-with-timeout-kl7c2   0/1       Completed   0          36m       10.130.0.6    ose1.cloud.genomics.cn
ruby-ex-1-build         0/1       Completed   0          22h       10.128.2.3    ose3.cloud.genomics.cn
ruby-ex-2-build         0/1       Completed   0          16h       10.131.0.8    ose2.cloud.genomics.cn
ruby-ex-3-88vr4         1/1       Running     0          16h       10.129.0.5    ose4.cloud.genomics.cn
ruby-ex-3-mxm75         1/1       Running     1          16h       10.131.0.11   ose2.cloud.genomics.cn
ruby-ex-3-z4495         1/1       Running     0          16h       10.130.2.24   ose7.cloud.genomics.cn

Notice that the pod above (ruby-ex-3-mxm75) went from Running to Completed, and then back to Running.

Why, then, does https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/ say the following?

Bare Pods

When the node that a pod is running on reboots or fails, the pod is terminated and will not be restarted. However, a Job will create new pods to replace terminated ones. For this reason, we recommend that you use a job rather than a bare pod, even if your application requires only a single pod.
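For contrast, the "bare pod" the docs warn about can be sketched as a standalone Pod with no controller behind it (the name is illustrative):

```yaml
apiVersion: v1
kind: Pod              # no Job or ReplicationController owns this pod
metadata:
  name: pi-bare        # hypothetical name, for illustration only
spec:
  containers:
  - name: pi
    image: perl
    command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
```

If the node running such a pod fails, nothing recreates it elsewhere. The ruby-ex-3 pods above, by contrast, appear to be managed by a replication controller, which presumably explains why a replacement showed up after the reboot.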

The event log from oc describe pod ruby-ex-3-mxm75 shows:

[screenshot: event log from oc describe pod ruby-ex-3-mxm75]

What happened is that the controller manager detected the network failure, killed the old pod, and created a new pod; it just kept using the original pod's name.