Open majie86 opened 3 years ago
2021-08-18 22:22:22.595, INFO , [OkHttp https://192.168.56.106:6443/...] c.a.chaosblade.box.invoker.blade.kubeapi.ChaosBladeAttackChaosInvoker - 子任务运行中,检查 CRD 状态,NAME: 83fea9a39c0f4b48900303a212214696, PHASE: Error, 是否成功: false, 失败原因: cannot find container, please confirm if the container exists
2021-08-18 22:22:22.601, INFO , [EXPERIMENT-TASK-THREAD-1] c.a.c.box.service.task.stateless.KubernetesAttackActivityTaskHandler - 子任务运行中,任务ID: 1427999302198321153,阶段:ATTACK, 子任务ID: 1427999302332538882, 当前机器: [{"ip":"10.244.1.12","clusterId":1426913840968847361,"nodeName":"k8s-node1","namespace":"argocd","podName":"argocd-application-controller-0","machineId":1426914606064422914,"machineType":2}], 是否成功: false, 失败原因: cannot find container, please confirm if the container exists
2021-08-18 22:22:22.610, ERROR, [EXPERIMENT-TASK-THREAD-1] c.a.c.box.service.task.stateless.DefaultActivityTaskPhaseHandler - 子任务运行失败,任务ID: 1427999302198321153,阶段:ATTACK, 子任务ID: 1427999302332538882
com.alibaba.chaosblade.box.common.exception.BizException: cannot find container, please confirm if the container exists
    at com.alibaba.chaosblade.box.service.task.stateless.KubernetesAttackActivityTaskHandler.lambda$handle$1(KubernetesAttackActivityTaskHandler.java:119)
    at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
    at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
    at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
    at java.base/java.lang.Thread.run(Thread.java:832)
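After reading the source, I found the cause: the namespace is missing when the experiment is persisted to the database, so when the experiment runs, the pod is looked up in the default namespace, which produces the exception above. The workaround is to fill in the namespace manually in the UI; after saving, the exception disappears and the experiment runs normally. I have also added code at line 181 of com.alibaba.chaosblade.box.service.impl.ExperimentServiceImpl to save the pod information, and will submit a PR for review shortly. Thanks!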
By default, the first pod in the array of selected pods (index=0) is used, that is, the pod selected most recently in the web UI.
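The behavior described above can be illustrated with a minimal Java sketch. It is not the code from the PR or from chaosblade-box: the class `NamespaceFallbackSketch`, the type `SelectedPod`, and the methods `resolveNamespace` and `namespaceToPersist` are hypothetical names introduced only for illustration. It assumes that a missing persisted namespace falls back to `default` at run time, and that the fix fills the namespace from the first selected pod (index=0) when none was entered in the UI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from chaosblade-box.
final class NamespaceFallbackSketch {

    static final String DEFAULT_NAMESPACE = "default";

    /** A pod selected in the web UI (hypothetical model). */
    static final class SelectedPod {
        final String namespace;
        final String podName;

        SelectedPod(String namespace, String podName) {
            this.namespace = namespace;
            this.podName = podName;
        }
    }

    /** Run-time side: with no persisted namespace, the lookup falls back to "default". */
    static String resolveNamespace(String persistedNamespace) {
        if (persistedNamespace == null || persistedNamespace.isEmpty()) {
            return DEFAULT_NAMESPACE;
        }
        return persistedNamespace;
    }

    /**
     * Save side (assumed shape of the fix): keep the namespace typed in the UI,
     * otherwise take it from the first selected pod (index=0).
     * Returns null when nothing is available to persist.
     */
    static String namespaceToPersist(String userInput, List<SelectedPod> selectedPods) {
        if (userInput != null && !userInput.isEmpty()) {
            return userInput;
        }
        if (selectedPods == null || selectedPods.isEmpty()) {
            return null;
        }
        return selectedPods.get(0).namespace;
    }

    public static void main(String[] args) {
        List<SelectedPod> pods = new ArrayList<>();
        pods.add(new SelectedPod("argocd", "argocd-application-controller-0"));

        // Without the fix nothing is persisted, so the pod is searched in "default".
        System.out.println(resolveNamespace(null));

        // With the fix the namespace of the first selected pod is persisted and used.
        System.out.println(resolveNamespace(namespaceToPersist(null, pods)));
    }
}
```

With the pod from the log, the first call resolves to `default`, where `argocd-application-controller-0` does not exist, while the second resolves to `argocd`.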