fsdp Search Results - Githubissues

1000+ results for fsdp

pytorch/ao #47

[RFC] Plans for torchao

### Summary Last year, we released [pytorch-labs/torchao](https://github.com/pytorch-labs/ao) to provide acceleration of Generative AI models using native PyTorch techniques. Torchao added support …

supriyar updated 5 months ago
21
pytorch/ao #652

FSDP 2 low bit optim broken on pytorch nightlies

To repro: `python test/prototype/test_low_bit_optim.py TestFSDP2.test_fsdp2` Logs ``` - Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html =========================== short…

msaroufim updated 1 month ago
15
allenai/OLMoE #8

MOE Export Parallelism Training Script

Hello OLMoE team, I’m currently exploring training scripts for models using Mixture of Experts (MOE) and was wondering if there are any existing or planned scripts that handle expert parallelism du…

wdlctc updated 1 month ago
5
pytorch/pytorch #127697

[dynamo] Issue with construction nn.Parameter

### 🐛 Describe the bug With https://github.com/pytorch/pytorch/pull/126578, I am seeing many issues with dynamo tracing of nn.Parameter construction. My PR does not do anything special with nn.Param…

anijain2305 updated 4 months ago
5
pytorch/pytorch #129457

[PT2][fp8][FSDP2] compile the function that pre-computes fp8…

### 🚀 The feature, motivation and pitch Sharing a repro for @bdhirsh, @tugsbayasgalan on the gaps of torch.compile for FSDP2 fp8 all-gather. For FSDP2 fp8 all-gather, it's critical to pre-compute ama…

weifengpy updated 1 month ago
3
facebookresearch/dinov2 #110

Is there a DDP training code?

Thank you for sharing the fantastic work. As I do not have a SLURM cluster, is there DDP training code? Or can anyone help?

deropty updated 1 month ago
7
huggingface/accelerate #3065

Recommend dropping MS-AMP support

As someone who used this library for a while in prod, then gave up, I'd honestly recommend just dropping it to simplify the code. There are several issues: - it isn't being very actively maintaine…

rationalism updated 2 days ago
4
axolotl-ai-cloud/axolotl #1799

Zamba2AttentionDecoderLayer.forward() takes from 4 to 10 pos…

### Please check that this issue hasn't been reported before. - [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) didn't find any similar reports. ###…

lucyknada updated 2 months ago
2
lm-sys/FastChat #2465

[Fine-Tuning Fail]: Problem Running FastChat T5 fine-tuning

#### I'm attempting to fine-tune FastChat T5 locally using the command: torchrun --nproc_per_node=1 --master_port=9778 fastchat/train/train_flant5.py \ --model_name_or_path {my_path}/test_fa…

pcchen-ntunlp updated 11 months ago
1
openxla/xla #11824

call ptxas become defunct, cause xla hung

I'm running Llama-2-1.7b-hf + fsdp + xla, but the process list shows ` 523777 517263 0 80 0 - 0 - 10:22 ? 00:00:00 [ptxas] `. I used gdb to debug process `517263`, which shows this backtrace…

zjjott updated 5 months ago
2
