-
https://www.answer.ai/posts/2024-04-26-fsdp-qdora-llama3.html
That looks awesome!
-
In awsome-distributed-training/3.test_cases/10.FSDP, when running `sbatch 1.distributed-training.sbatch` ( [1.distributed-training.sbatch](https://github.com/aws-samples/awsome-distributed-training/bl…
-
Hi,
my environment is as follows:
docker image: docker run --gpus all -it --net=host --ipc=host --ulimit memlock=-1 -v /home/ubuntu/test:/home/finetune -v /ssd/gyou:/models --name=vicuna nvcr.io/nvidia/pytor…
-
### 🐛 Describe the bug
It uses `_allgather_base`, but there is no support for this in the Gloo backend:
```
RuntimeError: no support for _allgather_base in Gloo process group
```
### Versions
main
…
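For context, a minimal single-process sketch of the code path that hits this error might look like the following (assuming a CPU-only run on the Gloo backend; the setup details are illustrative):
```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Illustrative single-process process-group setup on the Gloo backend.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

# FSDP all-gathers its flat parameters via _allgather_base, which Gloo does not
# implement, so the forward pass is expected to raise the RuntimeError above.
model = FSDP(nn.Linear(8, 8))
out = model(torch.randn(4, 8))
```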
-
### 🐛 Describe the bug
I was trying to use CUDA graphs (`torch.cuda.make_graphed_callables`) on a model wrapped with `FullyShardedDataParallel` (FSDP) and I got the following error:
```
-- Process …
```
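For reference, a reduced sketch of the combination being attempted (process-group initialization and the real model are omitted; shapes and wrapping are illustrative):
```python
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Toy FSDP-wrapped module standing in for the real model.
model = FSDP(nn.Linear(32, 32).cuda())
sample_input = torch.randn(8, 32, device="cuda")

# Capture the wrapped forward pass into a CUDA graph; this is the call that
# fails when combined with FSDP.
graphed_model = torch.cuda.make_graphed_callables(model, (sample_input,))
```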
-
### Feature request
Like the trainer arguments data class https://github.com/huggingface/transformers/blob/2a002d073a337051bdc3fbdc95ff1bc0399ae2bb/src/transformers/training_args.py#L167
It's goo…
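For reference, the requested pattern mirrors a plain `dataclasses`-based configuration object; a hypothetical sketch (the class and field names below are illustrative, not an existing API):
```python
from dataclasses import dataclass, field

@dataclass
class FSDPTrainingArguments:
    """Hypothetical dataclass of FSDP-related training options."""
    sharding_strategy: str = field(
        default="FULL_SHARD", metadata={"help": "FSDP sharding strategy to use."}
    )
    cpu_offload: bool = field(
        default=False, metadata={"help": "Offload sharded parameters to CPU."}
    )
    mixed_precision: bool = field(
        default=True, metadata={"help": "Enable reduced-precision compute and communication."}
    )
```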
-
PEFT finetuning (LoRA, adapter) raises the following warning for each FSDP-wrapped layer (transformer block in our case):
```python
The following parameters have requires_grad=True:
['transformer…
```
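A toy sketch of the condition that produces this warning, i.e. frozen and trainable parameters ending up in the same FSDP-wrapped unit (process-group setup omitted; the module is a stand-in for a LoRA-adapted block):
```python
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Stand-in for a PEFT model: a frozen "base" layer plus a trainable "adapter" layer.
model = nn.Sequential(nn.Linear(16, 16), nn.Linear(16, 16))
for p in model[0].parameters():
    p.requires_grad = False  # frozen base weights

# Wrapping the whole block in one FSDP unit mixes requires_grad=True and
# requires_grad=False parameters, which is what the warning reports.
fsdp_model = FSDP(model)  # assumes dist.init_process_group() has already run
```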
-
## 🐛 Bug
We need to add regression benchmarks for the FSDP API and possible input combinations. These regression benchmarks should be added to [fairscale/benchmarks](https://github.com/facebookrese…
-
I get `xlarun: command not found` even though I used the container you provided; the command is not available in it.
-
Hi,
I am trying to launch the `dinov2/train/train.py` script directly, without the Slurm scheduler. I use the following command to launch the training:
```
export CUDA_VISIBLE_DEVICES=0,1 && python dino…
```
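For comparison, a common non-Slurm way to launch a multi-GPU PyTorch script on a single node is `torchrun`; whether it works here depends on how `dinov2/train/train.py` initializes its distributed state, so treat this as a sketch rather than a verified launch command:
```
# Illustrative only: two local GPUs, script arguments unchanged from the python launch.
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 dinov2/train/train.py <training args>
```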