chaosblade-io / chaosblade

An easy-to-use and powerful chaos engineering experiment toolkit. (An easy-to-use, powerful chaos experiment injection tool open-sourced by Alibaba.)
https://chaosblade.io
Apache License 2.0
5.86k stars 934 forks

Create cri process kill failed #1046

Closed zexiplus closed 3 weeks ago

zexiplus commented 3 weeks ago

Issue Description

Type: bug report

Describe what happened (or what feature you want)

I started a container running httpd, with container ID 45f172395fbf. When I executed the command:

blade create cri process kill --process httpd --signal 15 --container-id 45f172395fbf

I received the following error:

{"code":63010,"success":false,"error":"`httpd`: get process id by name failed, err: `/opt/chaosblade/bin/nsexec -t 3460628 -p -m -- /bin/sh -c ps -eo user,pid,ppid,args | grep \"httpd\"  | grep -v -w blade | grep -v -w grep | grep -v -w chaos_killprocess | grep -v -w chaos_stopprocess | awk '{print $2}' | tr '\\n' ' '`: cmd exec failed, err: /bin/sh: ps: not found\n/bin/sh: grep: not found\n/bin/sh: grep: not found\n/bin/sh: grep: not found/bin/sh\n: grep: not found\n/bin/sh: grep: not found\n/bin/sh: awk: not found\n/bin/sh: tr: not found\n exit status 127"}
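For reference, the lookup that fails above is a ps | grep | awk | tr pipeline that nsexec runs through /bin/sh for target PID 3460628, so every one of those binaries has to be resolvable from that shell. Below is a minimal sketch of a comparable lookup that uses only shell builtins and /proc. It is an illustration, not chaosblade code: `pids_by_name` and its optional `proc_root` argument are hypothetical names, and it assumes a Linux-style /proc with a readable `comm` file per process. Note that it matches the exact command name, while the original grep matches any substring of the full argument list.

```shell
# pids_by_name NAME [PROC_ROOT]
# Hypothetical helper (not part of chaosblade): print the PIDs whose
# command name equals NAME, space-separated, using only shell builtins.
# PROC_ROOT defaults to /proc and exists mainly to make testing easy.
pids_by_name() {
  name=$1
  proc_root=${2:-/proc}
  out=""
  # Numeric directory names under /proc are process IDs.
  for dir in "$proc_root"/[0-9]*; do
    # Skip entries that vanished or that we are not allowed to read.
    [ -r "$dir/comm" ] || continue
    # comm holds the short command name on a single line.
    read -r comm < "$dir/comm" || continue
    if [ "$comm" = "$name" ]; then
      # Strip the leading path to keep only the PID.
      out="$out ${dir##*/}"
    fi
  done
  # Drop the leading space added by the first match.
  printf '%s\n' "${out# }"
}
```

Calling `pids_by_name httpd` from a shell inside the container would print the matching PIDs on one line, or an empty line if nothing matches.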

Describe what you expected to happen

I expected the blade create cri process kill command to successfully find and kill the httpd process in the specified container.

How to reproduce it (as minimally and precisely as possible)

  1. Start a container running httpd.
  2. Use the container ID 45f172395fbf.
  3. Execute the command:
     blade create cri process kill --process httpd --signal 15 --container-id 45f172395fbf

Tell us your environment

Anything else we need to know?

The issue appears to be caused by utilities that are missing from the container environment: ps, grep, awk, and tr. The error shows that /bin/sh cannot find these commands when nsexec runs the process lookup.


/bin/sh: ps: not found
/bin/sh: grep: not found
/bin/sh: grep: not found
/bin/sh: grep: not found
/bin/sh: grep: not found
/bin/sh: awk: not found
/bin/sh: tr: not found
exit status 127

This might be because the container image is minimal and does not provide these utilities on the PATH that /bin/sh searches.
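One way to confirm which utilities are actually missing is to check them from a shell in the same environment. The sketch below is only an illustration: `check_tools` is a hypothetical helper name, and it relies on nothing but the POSIX `command -v` builtin, so it should still work in an image where ps, grep, awk, and tr are absent.

```shell
# check_tools TOOL...
# Hypothetical helper: print the names of the given tools that cannot be
# resolved on the current PATH, space-separated on a single line.
check_tools() {
  missing=""
  for tool in "$@"; do
    # command -v is a shell builtin, so it needs no external binaries.
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Drop the leading space added by the first missing tool.
  printf '%s\n' "${missing# }"
}
```

Running `check_tools ps grep awk tr` inside the container prints the names that are not on PATH, or an empty line if all of them are available. How you get a shell in the container depends on your runtime.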