-
```
[rank0]: Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
[rank0]: Traceback (most…
```
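For context, this message typically comes from PyTorch flipping `torch.load`'s `weights_only` default to `True` (in 2.6/nightly), which refuses to unpickle arbitrary objects. A minimal sketch of the two usual workarounds; the checkpoint path and the allowlisted class are placeholders:

```python
import torch

# Option 1: allowlist the specific classes the checkpoint needs
# (safer than disabling the check entirely).
# torch.serialization.add_safe_globals([MyConfigClass])  # hypothetical class

# Option 2: fall back to the old behavior -- only for checkpoints from a
# source you fully trust, since full unpickling can execute arbitrary code.
state = torch.load("checkpoint.pt", weights_only=False)  # placeholder path
```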
-
When fine-tuning a model with LoRA, calling `loss.backward()` inside the Trainer raises "element 0 of tensors does not require grad and does not have a grad_fn". Some tutorials suggest adding `loss.requires_grad_(True)` before that line; doing so does make the error go away, but the parameters no longer update. Is there a solution?
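For reference, `loss.requires_grad_(True)` only marks a tensor that is already detached from the autograd graph as a leaf, so `backward()` never reaches the weights; the usual fix is to make sure gradients can flow into the LoRA adapters in the first place. A minimal sketch with PEFT, assuming a causal-LM setup with gradient checkpointing (model name, rank, and target modules are placeholders):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder

# With gradient checkpointing, inputs must require grad so the backward
# graph reaches the frozen base + trainable LoRA layers.
model.gradient_checkpointing_enable()
model.enable_input_require_grads()

lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)

# Sanity check: the LoRA parameters (at least) must require grad.
assert any(p.requires_grad for p in model.parameters())
```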
-
### System Info
```Shell
Latest main version, torch nightly, cuda 12.6
```
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] One of t…
-
Motivation: to build functionality similar to NCCLX, as described in sections 3.3.3 (Collective Communication) and 3.3.4 (Reliability and Operational Challenges) of the [Llama-3 paper](https://arxiv.org/abs/2407.21783).
-
Hi all, first of all, thanks for your great work!
I have an issue when trying to use the optimizer with FSDP training.
The error is:
```
optimizer = DistributedShampoo(
  File "/root/slurm/src/opti…
```
-
## 🐛 Bug
In our internal tests, the new `xm.all_gather` API implemented in https://github.com/pytorch/xla/pull/3275 is shown to take significantly more memory to execute than the previous all-gathe…
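For reference, a minimal sketch of the API in question (tensor shapes are illustrative):

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
t = torch.ones(128, 256, device=device)

# Gathers the tensor from all replicas along dim 0; the result has shape
# (world_size * 128, 256), which is where the extra memory pressure comes from.
gathered = xm.all_gather(t, dim=0)
```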
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) didn't find any similar reports.
###…
-
Tracker issue for adding [LayerSkip](https://arxiv.org/abs/2404.16710) to AO.
This is a training and inference optimization that is similar to layer-wise pruning. It's particularly interesting for…
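As a rough illustration of the training-time half of the idea, here is a minimal sketch of per-layer stochastic depth with a drop rate that grows with depth (my own simplification for illustration, not the AO implementation; LayerSkip additionally uses early-exit losses):

```python
import torch
import torch.nn as nn

class LayerDropStack(nn.Module):
    """Applies layer dropout with a rate that increases with depth,
    so later layers are skipped more often during training."""
    def __init__(self, layers: nn.ModuleList, max_drop: float = 0.2):
        super().__init__()
        self.layers = layers
        self.max_drop = max_drop

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n = len(self.layers)
        for i, layer in enumerate(self.layers):
            drop_p = self.max_drop * i / max(n - 1, 1)  # linear ramp over depth
            if self.training and torch.rand(()) < drop_p:
                continue  # skip this layer (identity path)
            x = layer(x)
        return x
```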
-
Hello,
May I know whether the current FSDP and DeepSpeed integrations are stable and available for use? Do they support multi-machine, multi-GPU training and LoRA fine-tuning?
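For what it's worth, a minimal sketch of switching on FSDP through 🤗 Transformers' `TrainingArguments` (the output directory is a placeholder; multi-node runs are launched separately, e.g. via `torchrun`):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",             # placeholder
    per_device_train_batch_size=1,
    fsdp="full_shard auto_wrap",  # enable FSDP parameter sharding
)
```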
-
## ❓ Questions and Help
Hi, I received a loss of None when training the model. Can anyone help?
Simple reproduction Kaggle notebook: [link](https://www.kaggle.com/code/liondude/notebook548442067d)
```
im…
```
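A common cause with 🤗 Transformers models is that the forward pass returns `loss=None` when no `labels` are passed. A minimal sketch of the check (the model name is a stand-in, not taken from the notebook):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")

batch = tok("hello world", return_tensors="pt")
# Without `labels=...`, outputs.loss is None; supplying labels makes the
# model compute the LM loss internally.
outputs = model(**batch, labels=batch["input_ids"])
print(outputs.loss)  # a scalar tensor, not None
```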