-
Hi maintainers,
I'd like to propose adding distributed circuit breaker support to Sony/gobreaker by allowing the circuit breaker state (open/closed/half-open) to be stored in an external datastore …
-
### Please describe your problem in detail
I'm trying to start a PyTorch training job using Volcano and the PyTorch plugin. I have 2 nodes, each with 8 GPUs.
I found that Volcano sets WORLD_SIZE = 2, RANK …
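
For reference, here is a minimal sketch of how node-level and process-level values usually relate when one process is launched per GPU. The function name `global_ranks` and its parameters are hypothetical and chosen only for illustration; this is not Volcano's or PyTorch's code.

```python
from __future__ import annotations


def global_ranks(num_nodes: int, gpus_per_node: int, node_rank: int) -> tuple[int, list[int]]:
    """Return (process-level world size, global ranks hosted on this node).

    Assumes one worker process per GPU and the same GPU count on every node.
    """
    if not 0 <= node_rank < num_nodes:
        raise ValueError("node_rank must be in [0, num_nodes)")
    world_size = num_nodes * gpus_per_node
    first = node_rank * gpus_per_node
    return world_size, list(range(first, first + gpus_per_node))


if __name__ == "__main__":
    # 2 nodes with 8 GPUs each, as in the setup described above.
    for node in range(2):
        print(node, global_ranks(2, 8, node))
```

Under these assumptions, a node-level world size of 2 corresponds to a process-level world size of 16.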
-
### Proposal:
Currently, when a table does not exist or does not support inserts, we get the following error:
```
table 'hello2' absent, or does not support INSERT
```
Unfortunately, we n…
-
At present, a distributed table can only manage select and update operations. The goal is to enable it to process insert, replace, and delete operations as well.
Here is a high-level specification …
-
Context: I am trying to run distributed training on 2 A100 GPUs with 40 GB of VRAM. The batch size is 3 and gradient accumulation is 1. I have attached the config file below for more details and the er…
-
Hi all,
I've been trying to use a `ConstantFESpace` with the GridapDistributed.jl package, but within the definition of the reference space for the `ConstantFESpace`, I get an error that this funct…
-
Hi! I downloaded elm-dev (by way of elm-prefab), and the package only downloads a single exe, which does not work. I debugged this a little and found that there are some dynamic libraries missing on my sy…
-
This library shows great promise as a replacement for Akka within the Cats ecosystem. However, one key feature is missing: Actor Clustering and a mechanism for distributed workflows.
Is there any pla…
-
In the release notes of Loki 3.0.0, the following feature/enhancement is listed:
> Helm charts: A major upgrade to the Loki helm chart introduces support for Distributed mode (microservices), includ…
-
### 🚀 The feature, motivation and pitch
We have a `DistributedSampler` and we have a `WeightedRandomSampler`, but we don't have a distributed weighted sampler, to be used in, say, Distributed Data Pa…
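
One way such a sampler could work is sketched below in plain Python, with no PyTorch dependency. Every rank draws the same weighted sample (with replacement) from a shared seed, then keeps only its own strided slice. The class name `DistributedWeightedSampler` and all of its parameters are hypothetical; this is an illustration under those assumptions, not an existing PyTorch API.

```python
import random
from typing import Iterator, Sequence


class DistributedWeightedSampler:
    """Hypothetical sketch: weighted sampling with replacement, split across ranks."""

    def __init__(self, weights: Sequence[float], num_samples: int,
                 num_replicas: int, rank: int, seed: int = 0) -> None:
        if not 0 <= rank < num_replicas:
            raise ValueError("rank must be in [0, num_replicas)")
        self.weights = list(weights)
        self.num_samples = num_samples      # samples per rank
        self.num_replicas = num_replicas
        self.rank = rank
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch: int) -> None:
        # Changing the epoch changes the shared seed, so each epoch reshuffles.
        self.epoch = epoch

    def __iter__(self) -> Iterator[int]:
        # Same seed on every rank, so all ranks draw the identical global list.
        rng = random.Random(self.seed + self.epoch)
        total = self.num_samples * self.num_replicas
        indices = rng.choices(range(len(self.weights)), weights=self.weights, k=total)
        # Each rank keeps a disjoint strided slice of the global draw.
        return iter(indices[self.rank::self.num_replicas])

    def __len__(self) -> int:
        return self.num_samples


if __name__ == "__main__":
    for r in range(2):
        sampler = DistributedWeightedSampler([0.1, 0.1, 0.8], num_samples=4,
                                             num_replicas=2, rank=r)
        print(r, list(sampler))
```

Because sampling is with replacement, the same index can still land on more than one rank; the strided split only guarantees that the ranks partition one global draw.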