-
Introduce chaos testing as a way to test the stability and resiliency of Edge.
For example, Edge has an external dependency on the Kubernetes control plane. In larger environments, we need to be a…
-
After the recent reproposal revamp in https://github.com/cockroachdb/cockroach/pull/97779, we added `TestKVNemesisSingleNode_ReproposalChaos` to do reproposal chaos testing (e.g. inject various errors…
-
## Bug Report
**What version of Kubernetes are you using?**
1.26.12
**What version of Chaos Mesh are you using?**
2.6.3
**What did you do? / Minimal Reproducible Example**
We are trying to…
-
Test the following:
1. A node gets disconnected, the rest of the cluster makes progress, and the node catches up
2. Two partitions: neither side makes progress, then they reconnect
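
As a rough sketch of the outcomes these two scenarios should produce, the snippet below assumes a majority-quorum system, where a side makes progress only if it holds a strict majority of nodes. That assumption, the cluster sizes, and the names `has_quorum` and `sides_with_quorum` are illustrative only and do not come from the issue.

```python
def has_quorum(partition_size: int, cluster_size: int) -> bool:
    """Return True if a partition holds a strict majority of the cluster."""
    return partition_size > cluster_size // 2


def sides_with_quorum(partitions: list[int]) -> list[bool]:
    """For each partition size, report whether that side could make progress."""
    cluster_size = sum(partitions)
    return [has_quorum(size, cluster_size) for size in partitions]


# Scenario 1 (assumed 5-node cluster): one node is cut off, the other four
# keep a majority and make progress; the isolated node must catch up later.
scenario_one = sides_with_quorum([4, 1])

# Scenario 2 (assumed 4-node cluster): an even split leaves neither side
# with a majority, so neither should make progress until they reconnect.
scenario_two = sides_with_quorum([2, 2])
```

A real test would still have to inject the partition and check that the isolated node's log converges after reconnection. This only states the expected progress/no-progress outcome for each side.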
-
## Why
Expectation: the TMail core service should remain largely undisrupted by a Redis outage.
## How
Experiment on preprod to see what happens to the TMail deployment when Redis is down.
Some related Redis …
-
There are many scenarios where sinks can be slow, whether it is due to an intentional throttling of the consumer or because of other quirks. Today, we do not handle these situations well.
Related: h…
-
At KubeCon, they showed how they ran Litmus chaos tests in pipelines to prove that a commit made the application more or less resilient.
Litmus allows for defining "chaos tests" that can be run while d…
-
### Description
We're running a large-scale batch inference job on spot instances and trying to use as many GPUs as possible. We've observed this pattern for preemptions:
![image](https://github.com…