-
Hello,
We are encountering NPEs when using the Config provider for Secrets Manager; we haven't tried any other provider.
These seem to occur randomly, but frequently, when a worker or connector is started …
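For context, this is the usual config-provider wiring on the worker; a minimal sketch, where the alias `secrets` and the provider class are placeholders rather than our exact values:
```
# Worker properties (sketch; alias and class name are placeholders)
config.providers=secrets
config.providers.secrets.class=<SecretsManagerConfigProvider class>

# Connector configs then reference values as ${secrets:<path>:<key>}, e.g.:
# "database.password": "${secrets:prod/db:password}"
```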
-
Hello,
I read your documentation and blog post and was interested in running KernelShap in distributed mode.
I've installed `ray` and `alibi[ray]` as described in the documentation. Other packages were already i…
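For reference, this is roughly the setup I am following from the docs; `clf`, `X_reference`, and the `n_cpus` value are placeholders for my own model and data:
```
from alibi.explainers import KernelShap

# Sketch of the distributed setup; `clf` is any fitted model
# exposing predict_proba, and the n_cpus value is arbitrary here.
explainer = KernelShap(
    clf.predict_proba,
    distributed_opts={"n_cpus": 4},  # starts a ray-backed worker pool
)
explainer.fit(X_reference)           # background dataset
explanation = explainer.explain(X_test)
```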
-
**Describe the bug**
[Docs](https://grafana.com/docs/tempo/latest/release-notes/v2-3/#changes-to-the-overrides-module-configuration) state that starting from version 2.3 there is a new `overrides` b…
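As far as I understand from those docs, per-tenant limits are supposed to move under a nested `defaults` block, roughly like this (the specific limits shown are illustrative, not my real values):
```
overrides:
  defaults:
    ingestion:
      rate_limit_bytes: 20000000
    global:
      max_bytes_per_trace: 5000000
```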
-
### System Info
torch 2.4.1
transformers 4.46.0.dev0
trl 0.11.2
peft 0.13.1
GPU V100
CUDA …
-
## Summary
When attempting to resume a job from where it left off before reaching wall-time on a SLURM cluster using PyTorch Lightning, the `ckpt_path="hpc"` option causes an error if no HPC checkpoi…
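For context, the resume call looks roughly like this (trainer arguments trimmed; `MyModel` is a stand-in for our LightningModule):
```
from lightning.pytorch import Trainer

trainer = Trainer(max_epochs=10)
# "hpc" asks Lightning to restore the checkpoint written by the
# SLURM auto-requeue handler; this raises when none exists yet.
trainer.fit(MyModel(), ckpt_path="hpc")
```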
-
Medium
# Attacker will prevent distribution of USDC to stakers through frequent reward updates
### Summary
USDC's lower precision of 6 decimals and frequent reward updates will cause stakers to …
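A quick arithmetic sketch of the rounding (the amounts and update cadence are illustrative, and the accumulator shown is the standard rewardPerToken pattern rather than this codebase's exact formula):
```
# 1,000 USDC (6 decimals) streamed over 7 days
reward = 1_000 * 10**6                 # 1e9 base units
rate = reward // (7 * 24 * 3600)       # 1_653 units/s

# rewardPerToken += elapsed * rate * 1e18 / totalStaked
total_staked = 1_000_000 * 10**18      # 1M staked tokens (18 decimals)
increment = 1 * rate * 10**18 // total_staked  # attacker triggers an update every 1 s
print(increment)                       # 0 -> each update accrues nothing
```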
-
### Describe the bug
I ran the training but got this error:
### Reproduction
Run `accelerate config`:
```
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: FSDP
downcast_bf16: 'n…
```
-
Hi, thank you for your hard work.
I am evaluating 2D zetas (twisted with harmonic polynomials): the lemniscate case.
I just sum over x, y of (x^4 - 6*x^2*y^2 + y^4)/(x^2 + y^2)^4.
I can do this in nu…
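For concreteness, a minimal numpy sketch of the truncated lattice sum (the cutoff N and skipping the origin are my own choices here):
```
import numpy as np

N = 200  # truncation radius, chosen for illustration
x, y = np.meshgrid(np.arange(-N, N + 1), np.arange(-N, N + 1), indexing="ij")
mask = (x != 0) | (y != 0)            # exclude the singular origin term
xf = x[mask].astype(float)
yf = y[mask].astype(float)
terms = (xf**4 - 6.0 * xf**2 * yf**2 + yf**4) / (xf**2 + yf**2) ** 4
print(terms.sum())
```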
-
# The code is as follows:
```
CUDA_VISIBLE_DEVICES=1,2 accelerate launch train_flux_deepspeed_controlnet.py --config "train_configs/test_canny_controlnet.yaml"
```
# Error
```
The following values …
```
-
### 🐛 Describe the bug
Hello, I encountered some issues while using `torch.distributed.pipelining`.
I tested `PiPPy/examples/huggingface/pippy_gpt2.py` with the default configuration. Because I'm …
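For reference, I launch the example the usual way with torchrun (the world size here is illustrative, not necessarily what the example's README specifies):
```
torchrun --nproc_per_node=2 pippy_gpt2.py
```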