huggingface/accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
https://huggingface.co/docs/accelerate
Apache License 2.0
7.34k stars
875 forks
issues
load_checkpoint_and_dispatch does not work on nn.Modules with hook that manipulates tensors
#2919
captainst
opened
3 hours ago
0
Question about `float16` when load the model
#2918
AIR-hl
opened
6 hours ago
1
Support MUSA (Moore Threads GPU) backend in accelerate
#2917
fmo-mt
opened
8 hours ago
1
Allow multiple process per device
#2916
cifkao
opened
20 hours ago
1
AI assist to auto update scripts to use accelerate
#2915
samyakkkk
opened
1 day ago
1
Fix slowdown on init with `device_map="auto"`
#2914
muellerzr
closed
1 day ago
1
Multinode Inference using deepspeed zero3
#2913
NotTheStallion
opened
1 day ago
0
NCCL error during saving checkpoint with ds zero3
#2912
rubickkcibur
opened
1 day ago
1
Model loading extremely slow after upgrading from 0.31.0 to 0.32.0
#2911
chenyuteng01
closed
1 day ago
3
Model Memory Calculator For Different Input Token
#2910
ZorkJ
opened
1 day ago
3
[tests] fix bug in torch_device
#2909
faaany
closed
1 day ago
1
'from transformers import Trainer' can hinder multi-gpu training process on Jupyter.
#2908
SHEN2BAIYI
opened
2 days ago
8
Parameters out-of-sync in multi-GPU training, only 1st GPU actually contributes to training
#2907
parlance-zz
closed
2 days ago
0
ValueError: weight is on the meta device, we need a `value` to put in on 0.
#2906
nilsjohanbjorck
opened
2 days ago
1
False device placement when use with quantization_config
#2905
xinghaow99
opened
5 days ago
2
How to merge Qlora FSDP weights with an LLM and save model.
#2904
Minami-su
closed
4 days ago
2
With the same epoch, the result of multiple GPUs is much lower than that of a single GPU, why?
#2903
xiuguangLi
opened
1 week ago
1
Added a MultiCPU SLURM example using Accelerate Launch and MPIRun
#2902
okhleif-IL
closed
2 days ago
1
Problem on custom device_map
#2901
wonkyoc
opened
1 week ago
2
about run glm4 demo error
#2900
leizhu1989
opened
1 week ago
3
training loop freezes after first step on TPU
#2899
drimeF0
opened
1 week ago
2
Move to cpu takes extra memory usage after .gather()
#2898
xinghaow99
closed
6 days ago
7
Accelerate load_checkpoint_and_dispatch - RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1
#2897
adarsh-ks
closed
3 days ago
12
Feature Request: Pipeline multiple batches together for Llama3 70B distributed inference
#2896
ishan-gaur
closed
1 week ago
4
Add early support for `torchdata.stateful_dataloader.StatefulDataLoader` within the `Accelerator`
#2895
byi8220
opened
1 week ago
1
Importing `torchdata.stateful_dataloader` causes the test `check_seedable_sampler` to fail
#2894
byi8220
closed
2 days ago
9
notebook_launcher on kaggle tpu
#2893
lhiqwj173
opened
1 week ago
3
Add XLA Dynamo backends for training and inference
#2892
johnsutor
closed
2 days ago
1
How to set a custom Config in python code using Accelerate?
#2891
konstantinator
opened
1 week ago
1
More than 10 times slowdown between version 0.26.1 and version 0.31.0, EDIT: It was a data loading issue with Hugging Face Datasets
#2890
marhlder
closed
1 week ago
6
Hotfix PyTorch Version Installation in CI Workflow for Minimum Version Matrix
#2889
yhna940
opened
1 week ago
9
Make `log_line_prefix_template` Optional in Elastic Launcher for Backward Compatibility
#2888
yhna940
closed
2 days ago
4
fix mlu device longTensor bugs
#2887
huismiling
closed
2 days ago
2
Can't apply LoRA's PiSSA weight init when using DeepSpeed ZeRO3 + LoRA to finetune!
#2886
ANYMS-A
closed
1 week ago
4
The saved model with deepspeed zero3 can not be correctly loaded
#2885
rubickkcibur
closed
1 week ago
2
Why is there a double fetch in the first batch when using accelerate?
#2884
qsunyuan
opened
1 week ago
1
Add Profiler Support for Performance Analysis
#2883
yhna940
closed
3 days ago
3
accelerator.prepare can only be run once?
#2882
DavideHe
opened
2 weeks ago
2
typo in examples/slurm/submit_multinode.sh script
#2881
hubutui
closed
2 weeks ago
2
Add ignore_unexpected_keys arg to load_checkpoint_in_model()
#2880
Qubitium
closed
2 weeks ago
1
fix `load_state_dict` for xpu and refine xpu safetensor version check
#2879
faaany
closed
2 days ago
5
add `require_triton` and enable `test_dynamo` work on xpu
#2878
faaany
closed
2 days ago
4
Some adjustment for supporting Deepspeed-Ulysses
#2877
zeyugao
opened
2 weeks ago
2
make more cuda-only tests device-agnostic
#2876
faaany
closed
2 days ago
6
Correct loading of models with shared tensors when using accelerator.load_state()
#2875
jkuntzer
opened
2 weeks ago
2
fix bug when getting the real accelerator's device number
#2874
faaany
closed
2 days ago
6
Plan to support FSDP2?
#2873
ByronHsu
opened
2 weeks ago
6
Accelerate test fails: Exception: Could not find the transformer layer class to wrap in the model
#2872
MikaSie
opened
2 weeks ago
8
Cannot free VRAM after loading a quantized model
#2871
lstein
opened
2 weeks ago
1
Support for Torch XLA Dynamo Backend
#2870
johnsutor
closed
2 days ago
1