-
### 🐛 Describe the bug
When trying to add FSDP to our training code base, which includes a pipelining scheme, I encountered an issue when forward and backward passes are no longer interleaved but instead …
-
I have a setup with `torch.Lightning` where I'm using custom `torchmetrics.Metric` as loss function contributions. Now I want to be able to do it with `ddp` by setting `dist_sync_on_step=True`, but th…
-
## Reason
Enables changing sidesets from normals with distributed mesh.
## Design
Looks like it requires only a change or two in `SideSetsGeneratorBase` to check for remote elements.
## Impact…
-
Hi,
Does this library work on a distributed processing engine like Spark?
Thanks!
-
I'd like to start by saying I appreciate the work going into this plugin, and that this issue is for informational purposes for others looking to use it in a use case similar to mine.
Using the TestS…
-
If I want to substitute a multimodal dataset other than the one used in the thesis, where is your dataset reading and processing class?
-
I tried some basic stuff with a dask_kubernetes on ocean.pangeo.io. No luck.
I created a cluster and connected to it, created a gdrivefs, and then tried to read / write via xarray. I immediately get…
-
**Describe what's wrong**
We're evaluating different ClickHouse cluster topologies with the ClickHouse Operator, but we seem to be facing an issue, specifically in sharded clusters.
Currently we're work…
-
Hello,
We are in the process of evaluating the possibility of using dask and dask distributed in our Analytics Platform. However, there are some legacy problems that force us to customize work's ex…
-
Thank you for the interesting work.
Can you describe the required computer specs for running the inference?
I tried to run the inference, but I kept getting a device ordinal error.
`LOCAL_RANK` …