chaosblade-io / chaosblade

An easy to use and powerful chaos engineering experiment toolkit. (A simple, easy-to-use, and powerful chaos experiment injection tool open-sourced by Alibaba.)
https://chaosblade.io
Apache License 2.0
5.86k stars 934 forks

chaosblade: injecting a CPU fullload fault into a k8s pod reports error "cgroups load failed, controller is not supported" #1042

Open xirs opened 4 weeks ago

xirs commented 4 weeks ago

Issue Description

Injecting a CPU fullload fault into a k8s pod with chaosblade reports the error "cgroups load failed, controller is not supported".

Describe what happened (or what feature you want)

Executing "blade create k8s container-cpu fullload --container-ids" reports "cgroups load failed, controller is not supported".

image

Describe what you expected to happen

The fault is injected successfully.

How to reproduce it (as minimally and precisely as possible)

  1. Add "cgroup_no_v1=hugetlb" to the kernel boot parameters, then reboot.
  2. After the reboot, execute "blade create k8s container-cpu fullload --container-ids 6b6a43d506c21680ba1965b6caee0a36fdc63c265f2f7bcea8dc737ad6ad543e --kubeconfig ~/.kube/config --cpu-percent 48 --namespace xxxxxx --names xxxxxxx --timeout 100". It reports the error.

Tell us your environment

Debian GNU/Linux 10, Linux 5.15.120, with "cgroup_no_v1=hugetlb" in the kernel boot parameters

blade version 1.7.0

Anything else we need to know?

image

This code might be what introduces the error. It appears to compare /sys/fs/cgroup/ with /proc/self/mountinfo; if they do not match, it reports ErrControllerNotActive = errors.New("controller is not supported").

See https://github.com/containerd/cgroups/blob/be0e52032b497eef2fbfeb2c03f844721689b442/cgroup1/paths.go#L59

The differences between /sys/fs/cgroup/ and /proc/self/mountinfo (with "cgroup_no_v1=hugetlb" in the kernel boot parameters) are shown in this image.

So I guess the containerd/cgroups dependency may need to be upgraded.