-
## 🚀 Description
Pipeline parallelism is a technique used in deep learning model training to improve efficiency and reduce the training time of large neural networks. Here we propose a pipeline paral…
-
[fsdp_qlora.txt](https://github.com/user-attachments/files/15917513/fsdp_qlora.txt)
The loss is returned as NaN when using DataCollatorForCompletionOnlyLM with the FSDP pipeline (attached for referen…
-
### Feature request
Context: https://github.com/huggingface/transformers/pull/29588#discussion_r1523510004
### Motivation
The layer-wise optimizer is not GaLore-specific. We could apply it t…
-
Hi everyone,
I tried to reproduce the Alpaca fine-tuning, but I hit the following error. Could you please help me?
```python
Running command git clone --quiet https://github.com/huggingface/t…
```
-
Currently torchao QAT has two APIs, [tensor subclasses](https://github.com/pytorch/ao/blob/a4221df5e10ff8c33854f964fe6b4e00abfbe542/torchao/quantization/prototype/qat/api.py#L41) and [module swap](htt…
-
### System Info
```Shell
- `Accelerate` version: 0.29.2
- Platform: Linux-5.15.0-106-generic-x86_64-with-glibc2.35
- `accelerate` bash location: /usr/local/venv/bin/accelerate
- Python version:…
```
-
Hello, I was wondering whether you plan to release a script to convert weights trained with this repository to the huggingface format.
Currently, huggingface is the best way to share mode…
-
(ft_emb) b405@b405-CVN-Z790-GAMING-FROZEN:/media/b405/新加卷1/Workspace_linux/b405/ZH/llmProjects/FlagEmbedding$ torchrun --nproc_per_node 1 \
-m FlagEmbedding.reranker.run \
--output_dir /media/b405/新…
-
## 🚀 Feature
Make DDP/FSDP a regular transform (which largely means making transforms flexible enough to support this).
### Motivation
Currently DDP/FSDP is not a regular transform, lea…
-
Hi, will you release a method for model-parallel training across multiple GPUs?