We found that some test cases generated by Acto contain misconfigurations. Here is an example of a mutation from state 0 to state 1. In the following example (see CRD Definition), Acto adds a livenessProbe override to the custom resource. The override is invalid because RabbitMQ does not use port 8500, so Kubernetes keeps killing the pod because it can never pass the liveness check.
The alarm report contains many similar cases, such as an invalid image name or a missing field. This issue aims to solve, or at least mitigate, this problem.
What we could do
Improve the test cases generated by Acto.
Collect events and logs from Kubernetes, and classify the alarms.
Improve the test cases generated by Acto
TBD
Collect events (and logs) from Kubernetes, and classify the alarms.
The event below indicates that the pod has an invalid configuration and could not be created, which is different from a crash event.
We think this kind of event may indicate a misconfiguration.
Warning FailedCreate 50s (x19 over 5m40s) statefulset-controller create Pod test-cluster-server-2 in StatefulSet test-cluster-server failed error: Pod "test-cluster-server-2" is invalid: spec.containers[0].image: Required value
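As a rough illustration of the classification idea, here is a minimal Python sketch. It is not Acto's implementation: the `Event` type and `classify_event` function are hypothetical names introduced here, and the only rule taken from this issue is the `FailedCreate` / "is invalid" pattern shown above. The crash rule (reason `BackOff`) is an assumption about how kubelet reports restart back-off and would need to be checked against real event data.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Minimal stand-in for a Kubernetes event (hypothetical type, not Acto's)."""
    type: str      # "Normal" or "Warning"
    reason: str    # e.g. "FailedCreate"
    message: str   # human-readable message attached to the event


def classify_event(event: Event) -> str:
    """Return 'misconfiguration', 'crash', or 'unknown' for a single event."""
    if event.type != "Warning":
        return "unknown"
    # The API server rejected the object during validation, so the pod was
    # never created. This is the pattern from the event quoted in this issue.
    if event.reason == "FailedCreate" and "is invalid" in event.message:
        return "misconfiguration"
    # Assumption: a crashing container surfaces as a "BackOff" warning.
    if event.reason == "BackOff" and "restarting failed container" in event.message:
        return "crash"
    return "unknown"


failed_create = Event(
    type="Warning",
    reason="FailedCreate",
    message=(
        "create Pod test-cluster-server-2 in StatefulSet test-cluster-server "
        'failed error: Pod "test-cluster-server-2" is invalid: '
        "spec.containers[0].image: Required value"
    ),
)
print(classify_event(failed_create))  # misconfiguration
```

The livenessProbe example would not be caught by the `FailedCreate` rule, because in that case the pod is created and only later killed. It would need a separate rule.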
CRD Definition
Mutation:
Use the following custom resource to demonstrate. State 0:
State 1: