-
Currently, when using linear backoff with a delay of 0, failed jobs are retried on the same worker. However, in some scenarios, a worker might be on a malfunctioning machine, and we need the ability t…
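To make the zero-delay case concrete, here is a minimal sketch of linear backoff, assuming the wait before retry `n` is `n * baseDelay`. The class and method names (`LinearBackoffRetry`, `delayMillis`, `retry`) are hypothetical and are not taken from the project under discussion. With a base delay of 0 every wait is 0, so the retry happens immediately in the same process.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration only; not the job queue's actual implementation.
class LinearBackoffRetry {

    // Linear backoff: the wait grows proportionally with the attempt number.
    static long delayMillis(int attempt, long baseDelayMillis) {
        return (long) attempt * baseDelayMillis;
    }

    // Runs the job up to maxAttempts times, sleeping between failed attempts.
    static <T> T retry(Callable<T> job, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return job.call();
            } catch (Exception e) {
                last = e; // remember the failure and fall through to the backoff
            }
            if (attempt < maxAttempts) {
                try {
                    // With baseDelayMillis == 0 this returns immediately.
                    Thread.sleep(delayMillis(attempt, baseDelayMillis));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException("interrupted while backing off", ie);
                }
            }
        }
        throw new RuntimeException("job failed after " + maxAttempts + " attempts", last);
    }
}
```

Because the loop never leaves the current process, a sketch like this cannot move a job off a malfunctioning machine; that would need the retry to be handed back to a scheduler.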
-
### Emoji symbol
🛟
### Emoji code
:ring_buoy:
### Emoji description
Code related to fault tolerance
### Describe the use case of your emoji
Code used to prevent and manage errors, e.g. retrying …
-
### Summary
Currently, the `server` and `issuer` packages have server-like features that are not fault tolerant. If an error is thrown, it will bubble up to the caller and kill the loop. We need to desig…
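The failure mode described here, one thrown error ending the whole loop, can be illustrated with a small sketch. It assumes a generic processing loop; `ResilientLoop` and `processAll` are hypothetical names and do not come from the `server` or `issuer` packages. Each iteration catches its own error and records it, so the remaining items are still handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; not the project's code.
class ResilientLoop {

    // Processes every item; a failure in one iteration is recorded
    // instead of propagating to the caller and ending the loop.
    static <T> List<RuntimeException> processAll(List<T> items, Consumer<T> handler) {
        List<RuntimeException> errors = new ArrayList<>();
        for (T item : items) {
            try {
                handler.accept(item);
            } catch (RuntimeException e) {
                errors.add(e); // keep going with the next item
            }
        }
        return errors;
    }
}
```

Returning the collected errors leaves the decision about logging, retrying, or shutting down to the caller, which is one possible design among several.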
-
# Background
DLRover is an elastic deep learning framework with fault tolerance for process failures, pod loss, etc. Since LLM training runs at large scale and always spans a long time, many …
-
We need to create an Istio fault tolerance config rule to handle retries, circuit breaking, concurrent request limits, and timeouts. Set `MP_Fault_Tolerance_NonFallback_Enabled` in the configmap to disable MP FT …
-
@nmaludy asked to open an issue to discuss making certain retry parameters configurable.
The idea here is to make at least these two configurable via st2.conf:
https://github.com/StackStorm/st2/bl…
-
MicroProfile 7.0 is a major release with the current updated MicroProfile specifications:
1. MicroProfile Telemetry 2.0
2. MicroProfile Rest Client 4.0
3. MicroProfile OpenAPI 4.0
4. MicroProfile …
-
- [x] Proactive scheduling
- [x] Client assign returns a waiting time (avg round time); the client repeatedly pings the job during the waiting time
- [x] Proactive scheduling
- [x] Client TTL
- [ ] Clean up…
-
Fault tolerance annotations are silently ignored on fallback methods. For example:
```
@Fallback(fallbackMethod = "legacyFindBook")
public Book findBookInOnlineStore(String title) {
…
```
-
This will likely be broken into multiple issues, but I wanted to open this to keep track of these important items as we work on "stabilization" tasks towards a tape-out. Within `resource` and `qm…