-
### 🐛 Describe the bug
Create a simple distributed model.
Wrap the model with FSDP.
Using a stateful optimizer such as Adam(W), run without CPUOffload and profile/time.
Then run with CPUOffload and see th…
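A minimal sketch of these steps, with a placeholder model and illustrative sizes/step counts rather than the original reproducer:
```python
# Minimal sketch of the repro steps above; the model, sizes, and step counts
# are illustrative placeholders, not the original reproducer.
import time
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, CPUOffload

dist.init_process_group("nccl")  # assumes a torchrun-style launch
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

def run(offload: bool) -> float:
    model = nn.Sequential(*[nn.Linear(4096, 4096) for _ in range(8)]).cuda()
    model = FSDP(model, cpu_offload=CPUOffload(offload_params=offload))
    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)  # stateful optimizer
    x = torch.randn(8, 4096, device="cuda")
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(10):
        optim.zero_grad()
        model(x).sum().backward()
        optim.step()
    torch.cuda.synchronize()
    return time.time() - start

for offload in (False, True):
    elapsed = run(offload)  # every rank runs both configs to keep collectives aligned
    if dist.get_rank() == 0:
        print(f"cpu_offload={offload}: {elapsed:.2f}s")
```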
-
### 🐛 Describe the bug
I tried to use FSDP offload to run the Vicuna training;
however, after I ran this command, the code seems to stand still for 3-4 hours and prints nothing, and th…
-
### 🚀 The feature, motivation and pitch
The following are features that should be checked / hardened in order to roll out fully_shard as an alternative to class-based FSDP:
- [ ] Test with ShardedGr…
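For reference, a rough sketch of how the composable API sits next to the class-based wrapper (the `fully_shard` import path has moved between PyTorch releases; `torch.distributed._composable.fsdp` is assumed here):
```python
# Rough sketch only: contrasts the class-based wrapper with composable
# fully_shard. The fully_shard import path varies across PyTorch releases;
# torch.distributed._composable.fsdp is assumed here.
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed._composable.fsdp import fully_shard

dist.init_process_group("nccl")  # assumes a torchrun-style launch
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

def make_model() -> nn.Module:
    return nn.Sequential(nn.Linear(1024, 1024), nn.Linear(1024, 1024)).cuda()

# Class-based API: returns a new FSDP module that wraps the user's model.
wrapped = FSDP(make_model())

# Composable API: shards the module in place and keeps its original class,
# so checkpointing code and isinstance checks still see nn.Sequential.
model = make_model()
for layer in model:
    fully_shard(layer)  # each submodule becomes its own sharded group
fully_shard(model)      # finally shard the root
```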
-
### System Info
transformers: '4.45.1'
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### 🐛 Describe the bug
I have fine-tuned `Llama-3.2-11B-Vision-Instruct` fo…
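(As context, a hedged sketch of loading such a fine-tuned checkpoint under transformers 4.45.x; the local path is a placeholder, and the actual failing step is cut off above.)
```python
# Illustrative only: loading a fine-tuned Llama-3.2-11B-Vision-Instruct
# checkpoint with transformers 4.45.x; the local path is a placeholder.
import torch
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_path = "./llama-3.2-11b-vision-finetuned"  # hypothetical output dir
model = MllamaForConditionalGeneration.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_path)
```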
-
Hi and thanks for the great resources.
I used "train-deploy-llama3.ipynb" and trained a similar Llama3 model as shown in the notebook.
I pushed my model to Hugging Face and now I want to use that …
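One way to pull the pushed checkpoint back down for local inference with plain transformers (a sketch only; the repo id is a placeholder, and this is independent of whatever deployment path the notebook itself uses):
```python
# Illustrative sketch: loading a fine-tuned Llama 3 checkpoint back from the
# Hugging Face Hub; "your-username/your-llama3-finetune" is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/your-llama3-finetune"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Hello, fine-tuned model!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```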
-
Unable to run torchrun --nnodes 1 --nproc_per_node 4 llama_finetuning.py --enable_fsdp --use_peft --peft_method lora --model_name /path_of_model_folder/7B --pure_bf16 --output_dir Path/to/save/PEFT/…
-
## 🐛 Bug
When input sequences get longer, Thunder tends to use more memory than eager and torch.compile.
Let's take litgpt's `stablecode-completion-alpha-3b` as an example, whose sequen…
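A hedged sketch of the kind of peak-memory comparison being described, using a small stand-in layer rather than the actual litgpt `stablecode-completion-alpha-3b` model:
```python
# Illustrative memory comparison between eager, torch.compile, and Thunder on
# a small stand-in transformer layer with a long input sequence; the report
# itself uses litgpt's stablecode-completion-alpha-3b.
import torch
import torch.nn as nn
import thunder  # lightning-thunder

def peak_mem(model, x) -> float:
    torch.cuda.reset_peak_memory_stats()
    model(x).sum().backward()
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 2**30

base = nn.TransformerEncoderLayer(d_model=2048, nhead=16, batch_first=True).cuda()
x = torch.randn(1, 8192, 2048, device="cuda", requires_grad=True)  # long sequence

print("eager        :", peak_mem(base, x), "GiB")
print("torch.compile:", peak_mem(torch.compile(base), x), "GiB")
print("thunder.jit  :", peak_mem(thunder.jit(base), x), "GiB")
```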
-
I followed the steps in the README, but I get an empty state dict. Here is the code and the output:
code:
```python
trainer = Trainer(model=model, tokenizer=tokenizer, args=training_args, **data_module)
…
```
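If the empty dict comes from saving while parameters are still sharded (e.g. under FSDP, one common cause, though the truncated snippet does not show the parallelism setup), a rough sketch of gathering a full state dict on rank 0 before saving:
```python
# Hedged sketch only: gather a full (unsharded) state dict on rank 0 before
# saving; `trainer` is the HF Trainer from the snippet above, and the output
# filename is a placeholder. Only relevant if training actually used FSDP.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    StateDictType,
    FullStateDictConfig,
)

cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
with FSDP.state_dict_type(trainer.model, StateDictType.FULL_STATE_DICT, cfg):
    state_dict = trainer.model.state_dict()

if dist.get_rank() == 0:
    torch.save(state_dict, "pytorch_model.bin")  # placeholder filename
```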
-
### 🚀 The feature, motivation and pitch
**Background**
DistributedDataParallel (DDP) uses `Reducer` to bucket and issue `allreduce` calls. The main entry point of `Reducer` is through the gradient …
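For context, the existing hook point at this layer is the DDP communication hook, which intercepts the `Reducer`'s per-bucket `allreduce`; a minimal sketch (background only, not the proposed feature):
```python
# Minimal sketch of today's DDP comm-hook surface, which intercepts the
# Reducer's per-bucket allreduce; shown here only as context for the proposal.
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.distributed.algorithms.ddp_comm_hooks import default_hooks

dist.init_process_group("nccl")  # assumes a torchrun-style launch
model = DDP(nn.Linear(1024, 1024).cuda(), bucket_cap_mb=25)

# Replace the default per-bucket allreduce with an fp16-compressed one.
model.register_comm_hook(state=None, hook=default_hooks.fp16_compress_hook)

out = model(torch.randn(8, 1024, device="cuda"))
out.sum().backward()  # gradient hooks fire per bucket and call the comm hook
```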