-
I want to do low-cost error recovery from deep learning training failures, so I need to simulate some errors in my PyTorch training script to test my system.
I find that DCGM has the ability of fau…
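
A minimal sketch of what simulating errors in a training loop could look like, written in plain Python so that it runs without a GPU or PyTorch installed. Everything here is hypothetical and for illustration only: `FaultInjector`, `InjectedFault`, and `train` are made-up names, the "training step" is a stand-in multiplication, and the in-memory `checkpoint` dict stands in for whatever a real script would persist (for example with `torch.save` on a model's `state_dict()`). It does not use DCGM.

```python
import random


class InjectedFault(RuntimeError):
    """Raised by the injector to simulate a training failure."""


class FaultInjector:
    """Hypothetical helper: fails at chosen steps and/or with some probability."""

    def __init__(self, fail_at_steps=(), probability=0.0, seed=0):
        self.fail_at_steps = set(fail_at_steps)
        self.probability = probability
        self.rng = random.Random(seed)
        self.fired = set()

    def maybe_fail(self, step):
        # A scheduled step fires only once, so a retry can make progress.
        scheduled = step in self.fail_at_steps and step not in self.fired
        if scheduled or self.rng.random() < self.probability:
            self.fired.add(step)
            raise InjectedFault(f"injected fault at step {step}")


def train(num_steps, injector, checkpoint_every=2):
    """Toy training loop that rolls back to the last checkpoint on a fault."""
    state = {"step": 0, "loss": 10.0}
    checkpoint = dict(state)
    restarts = 0
    while state["step"] < num_steps:
        try:
            injector.maybe_fail(state["step"])
            state["loss"] *= 0.9  # stand-in for forward/backward/optimizer step
            state["step"] += 1
            if state["step"] % checkpoint_every == 0:
                checkpoint = dict(state)  # stand-in for saving to disk
        except InjectedFault:
            restarts += 1
            state = dict(checkpoint)  # recover from the last checkpoint
    return state, restarts
```

Calling `train(5, FaultInjector(fail_at_steps=[3]))` injects a single failure at step 3, rolls back to the checkpoint taken at step 2, and then finishes all five steps. The deterministic `fail_at_steps` schedule makes recovery tests repeatable, while `probability` allows randomized runs.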
-
1. Response preparation fails with -ENOMEM/-EINVAL
2. Making headers/data/trailer frames with RES_TERM_STREAM (we should send postponed frames and other responses successfully)
3. Making headers/dat…
-
### Checklist
- [X] I've read the [contribution guidelines](https://github.com/autowarefoundation/autoware/blob/main/CONTRIBUTING.md).
- [X] I've searched other issues and no duplicate issues were…
xmfcx updated
1 month ago
-
I've been reading about [fuse](https://github.com/jlouis/fuse), a mature circuit breaker library for Erlang (a platform known for "resiliency by default").
In circuit breaker configuration, they h…
-
Most mature distributed system projects have some sort of fault injection framework for testing purposes. Some examples:
- Hadoop: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hd…
-
When my k8s service has a limit configured for memory resources, setting the memory pressure test size to a percentage does not work: fault injection cannot push memory usage up, but the size can be set to the speci…
-
```
From SubmitChecklist:
21: Has been checked with injection of at least slab and page-allocation
failures. See Documentation/fault-injection/.
If the new code is substantial, addition of …
```
-
## Describe the feature
Currently, fault-injection cannot be implemented without delegating the failure test.
We would like to implement fault-injection functionality in Flagger.
ref: [link…