-
After a common section, the workshop is divided into tracks:
* observability
* event-driven and reactive
* fault-tolerance
* Quarkus extension
An introduction section should describe these and…
-
### What happened + What you expected to happen
The RayData job hangs and cannot recover from faults.
### Versions / Dependencies
The latest master branch.
macOS Monterey
### Reproduction scri…
-
To improve the fault tolerance of the stream system, we propose using short-lived streams instead of long-running streams in production. Short-lived streams are scheduled regularly, just like scheduled batch…
-
[DeepSpeed](https://github.com/microsoft/DeepSpeed) is an excellent framework for training LLMs on a large scale, while the mpi-operator is the ideal tool to facilitate this within the Kubernetes ecos…
-
Add a checkpointing/restarting system with a checksum system as in SPECFEM3D_GLOBE/tags/v4.1.0_beta_merged_mesher_solver_non_blocking_MPI/src, or use the fault tolerance library developed by Leonar…
-
```
When I read the Scalaris source code, I found several places where the
number of replicas and the size of the majority are hardcoded. At the same time,
these parameters are described in the…
-
Example: https://github.com/raiden-network/microraiden/issues/138#issuecomment-347636141
> Please include failure cases/fault tolerance schemes! (receiving node offline, contestation trees, etc)
…
-
Hi,
I was looking to use SmallRye Fault Tolerance in Apache TomEE for MicroProfile Fault Tolerance.
Unfortunately, I found that it does not work. I wanted to provide a TCK runner within a profile …
-
https://medium.com/eosio/dpos-bft-pipelined-byzantine-fault-tolerance-8a0634a270ba
https://github.com/EOSIO/eos/issues/2718#issuecomment-389222260
https://lamport.azurewebsites.net/pubs/paxos-si…
-
My cursory read-through of the codebase leads to the understanding that TreeWidth is on the CustomSlurmSettings' Deny list due to clusters deployed with the ec2hostnames DNS configuration. As in that …