-
Dear authors,
I encountered weight explosion problems while integrating LoRA into torchtitan. I am running with the train_configs/llama3_8b.toml config via run_llama_train.sh on 4 A10 24GB GPUs. PyT…
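For reference, here is a minimal sketch of the standard LoRA parameterization; the class name and defaults are illustrative, not torchtitan's actual API. Initializing `B` to zero (with a modest `alpha/r` scale) means training starts exactly at the frozen base weights, which is a useful sanity check when weights explode:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Illustrative LoRA wrapper: y = base(x) + (alpha/r) * x @ A^T @ B^T."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weight W0
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # up-projection, zero init
        self.scaling = alpha / r

    def forward(self, x):
        # B @ A is zero at init, so the layer initially matches the base model.
        return self.base(x) + self.scaling * F.linear(F.linear(x, self.A), self.B)
```

If the adapter's output diverges from zero immediately after initialization, the `B`-zero-init or the scaling factor is a likely place to look.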
-
**Description**
Tracking known issues during training with FSDP.
- Issue with resizing embedding dimensions in distributed training
- Behavior: This throws an exception with embedding sizes out of b…
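The traceback is truncated above, but a common pattern for avoiding this class of error is sketched below, assuming a Hugging Face model (the vocabulary size is hypothetical): resize the embeddings before the model is sharded, since resizing a parameter that FSDP has already flattened tends to fail.

```python
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
new_vocab_size = 32008  # hypothetical: e.g. after adding special tokens
model.resize_token_embeddings(new_vocab_size)  # resize while weights are still whole
model = FSDP(model)  # shard only after the embedding matrix has its final shape
```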
-
### System Info
Platform: Linux-5.15.148.2-2.cm2-x86_64-with-glibc2.35
Python version: 3.10.14
Bitsandbytes version: 0.43.1
Safetensors version: 0.4.5
Accelerate version: 0.34.2
Accelerate con…
-
### Describe the bug
I used /examples/text_to_image/train_text_to_image_sdxl.py to fine-tune SDXL with accelerate 0.25.0 + FSDP. When saving a checkpoint, it gets stuck and can'…
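One commonly cited workaround for FSDP checkpoint hangs is sketched below, assuming the model is wrapped directly with PyTorch's FSDP rather than through accelerate's wrapper: gather a full state dict on rank 0 with CPU offload, which avoids the GPU-memory spike of materializing the unsharded weights on-device.

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    StateDictType,
    FullStateDictConfig,
)

# Gather the full (unsharded) state dict on rank 0, offloaded to CPU.
save_policy = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT, save_policy):
    state = model.state_dict()
if dist.get_rank() == 0:
    torch.save(state, "checkpoint.pt")
```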
-
### System Info
- `transformers` version: 4.41.2
- Platform: Linux-4.9.151-015.ali3000.alios7.x86_64-x86_64-with-glibc2.17
- Python version: 3.8.18
- Huggingface_hub version: 0.23.2
- Safetenso…
-
(Q)DoRA, an alternative to (Q)LoRA, is quickly proving to be a superior technique at closing the gap between FFT and PEFT (a sketch of its weight decomposition follows the list below).
Known existing implementations:
- https://github.com/huggingface/…
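For context, DoRA reparameterizes each adapted weight as a learnable magnitude times a unit direction, where the direction is the frozen base weight plus a LoRA-style update. A minimal sketch of that decomposition (the class and hyperparameter names are illustrative, not taken from any of the linked implementations):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    """Illustrative DoRA layer: W' = m * (W0 + s * B @ A) / ||W0 + s * B @ A||,
    with a learnable per-output-row magnitude m."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        out_f, in_f = base.weight.shape
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)  # LoRA down-projection
        self.B = nn.Parameter(torch.zeros(out_f, r))        # LoRA up-projection, zero init
        self.scaling = alpha / r
        # Magnitude starts at the row norms of W0, so the layer initially
        # reproduces the frozen base model exactly.
        self.m = nn.Parameter(self.weight.norm(p=2, dim=1, keepdim=True))

    def forward(self, x):
        direction = self.weight + self.scaling * (self.B @ self.A)
        direction = direction / direction.norm(p=2, dim=1, keepdim=True)
        return F.linear(x, self.m * direction)
```

The extra trainable state over plain LoRA is only the magnitude vector, which is what lets DoRA adjust update magnitude and direction separately.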
-
Hi,
Does Determined support PyTorch's FSDP style of distributed training? I can see examples for DeepSpeed, but I specifically need to use the native FSDP feature of PyTorch 2.2 (something…
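For concreteness, this is the kind of native FSDP setup being asked about, independent of Determined; a minimal sketch with one process per GPU and a stand-in model:

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Standard native-FSDP initialization, typically launched via torchrun.
dist.init_process_group(backend="nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(1024, 1024).cuda()  # stand-in for a real model
model = FSDP(model)  # shards parameters, gradients, and optimizer state
```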
-
Dear developers!
Actually, my request concerns [this](https://github.com/NU-CUCIS/ALIGNNTL) repository for the ALIGNN transfer learning project, but since I have not received a reply on the issue …
-
### 🚀 The feature, motivation and pitch
I trained the current code with FSDP to fully fine-tune Llama2; it is very fast, but it turns out the performance is even worse than that of LoRA fine-tuned models u…
-
### System Info
```Shell
- `Accelerate` version: 0.29.1
- Platform: Linux-5.19.0-46-generic-x86_64-with-glibc2.35
- `accelerate` bash location: /home/yuchao/miniconda3/envs/TorchTTS/bin/accelerate
…