-
Dear authors,
I encountered weight explosion problems while integrating LoRA into torchtitan. I am running with the train_configs/llama3_8b.toml config via run_llama_train.sh on 4 A10 24GB GPUs. PyT…
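For reference, here is a minimal sketch of the standard LoRA parameterization; the class name and defaults are illustrative, not torchtitan's actual API. Initializing `B` to zero (with a modest `alpha/r` scale) means training starts exactly at the frozen base weights, which is a useful sanity check when weights explode:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Illustrative LoRA wrapper: y = base(x) + (alpha/r) * x @ A^T @ B^T."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weight W0
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # up-projection, zero init
        self.scaling = alpha / r

    def forward(self, x):
        # B @ A is zero at init, so the layer initially matches the base model.
        return self.base(x) + self.scaling * F.linear(F.linear(x, self.A), self.B)
```

If the adapter's output diverges from zero immediately after initialization, the `B`-zero-init or the scaling factor is a likely place to look.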
-
**Description**
Tracking known issues during training with FSDP.
- Issue with resizing embedding dimensions in distributed training
- Behavior: This throws an exception with embedding sizes out of b…
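The traceback is truncated above, but a common pattern for avoiding this class of error is sketched below, assuming a Hugging Face model (the vocabulary size is hypothetical): resize the embeddings before the model is sharded, since resizing a parameter that FSDP has already flattened tends to fail.

```python
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
new_vocab_size = 32008  # hypothetical: e.g. after adding special tokens
model.resize_token_embeddings(new_vocab_size)  # resize while weights are still whole
model = FSDP(model)  # shard only after the embedding matrix has its final shape
```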
-
### System Info
Platform: Linux-5.15.148.2-2.cm2-x86_64-with-glibc2.35
Python version: 3.10.14
Bitsandbytes version: 0.43.1
Safetensors version: 0.4.5
Accelerate version: 0.34.2
Accelerate con…
-
### Describe the bug
I used /examples/text_to_image/train_text_to_image_sdxl.py to fine-tune SDXL with accelerate 0.25.0 + FSDP. When saving a checkpoint, it gets stuck and can'…
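One commonly cited workaround for FSDP checkpoint hangs is sketched below, assuming the model is wrapped directly with PyTorch's FSDP rather than through accelerate's wrapper: gather a full state dict on rank 0 with CPU offload, which avoids the GPU-memory spike of materializing the unsharded weights on-device.

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    StateDictType,
    FullStateDictConfig,
)

# Gather the full (unsharded) state dict on rank 0, offloaded to CPU.
save_policy = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT, save_policy):
    state = model.state_dict()
if dist.get_rank() == 0:
    torch.save(state, "checkpoint.pt")
```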
-
### System Info
- `transformers` version: 4.41.2
- Platform: Linux-4.9.151-015.ali3000.alios7.x86_64-x86_64-with-glibc2.17
- Python version: 3.8.18
- Huggingface_hub version: 0.23.2
- Safetenso…
-
(Q)DoRA, an alternative to (Q)LoRA, is quickly proving to be a superior technique at closing the gap between FFT and PEFT (a sketch of its weight decomposition follows the list below).
Known existing implementations:
- https://github.com/huggingface/…
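For context, DoRA reparameterizes each adapted weight as a learnable magnitude times a unit direction, where the direction is the frozen base weight plus a LoRA-style update. A minimal sketch of that decomposition (the class and hyperparameter names are illustrative, not taken from any of the linked implementations):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    """Illustrative DoRA layer: W' = m * (W0 + s * B @ A) / ||W0 + s * B @ A||,
    with a learnable per-output-row magnitude m."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        out_f, in_f = base.weight.shape
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)  # LoRA down-projection
        self.B = nn.Parameter(torch.zeros(out_f, r))        # LoRA up-projection, zero init
        self.scaling = alpha / r
        # Magnitude starts at the row norms of W0, so the layer initially
        # reproduces the frozen base model exactly.
        self.m = nn.Parameter(self.weight.norm(p=2, dim=1, keepdim=True))

    def forward(self, x):
        direction = self.weight + self.scaling * (self.B @ self.A)
        direction = direction / direction.norm(p=2, dim=1, keepdim=True)
        return F.linear(x, self.m * direction)
```

The extra trainable state over plain LoRA is only the magnitude vector, which is what lets DoRA adjust update magnitude and direction separately.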
-
Hi,
Does Determined support PyTorch's FSDP style of distributed training? I can see examples for DeepSpeed, but I specifically need to use the native FSDP feature of PyTorch 2.2 (something…
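For concreteness, this is the kind of native FSDP setup being asked about, independent of Determined; a minimal sketch with one process per GPU and a stand-in model:

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Standard native-FSDP initialization, typically launched via torchrun.
dist.init_process_group(backend="nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(1024, 1024).cuda()  # stand-in for a real model
model = FSDP(model)  # shards parameters, gradients, and optimizer state
```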
-
Dear developers!
Actually, my request concerns [this](https://github.com/NU-CUCIS/ALIGNNTL) repository for the ALIGNN transfer learning project, but since I have not received a reply on the issue …
-
### 🚀 The feature, motivation and pitch
I trained the current code with FSDP to fully fine-tune Llama2; it is very fast, but it turns out the performance is even worse than that of LoRA fine-tuned models u…
-
### System Info
```Shell
- `Accelerate` version: 0.29.1
- Platform: Linux-5.19.0-46-generic-x86_64-with-glibc2.35
- `accelerate` bash location: /home/yuchao/miniconda3/envs/TorchTTS/bin/accelerate
…